INTRODUCTION


THE PREPARATION of this special number regarding psychological research
in or for the Armed Forces was initiated by Alvin C. Eurich, as president 
of the American Educational Research Association, and the executive com- 
mittee of the association shortly after the end of World War II. A com- 
mittee composed of individuals representing several of the various groups 
of psychologists working on military problems was appointed. The com- 
mittee agreed that such a review and bibliography could be especially 
valuable if it were comprehensive and directed attention to the wealth of 
materials which had not been made generally available and to a large 
extent probably never would be published in the professional journals. 
It was also believed that this material could be best reviewed by those 
who actually participated in the research and were familiar with the 
essential background and conditions. 

Assignments for various chapters and sections were made by the com- 
mittee and a tentative schedule established. Unfortunately, the pressure 
of preparing official reports and the extensive personnel shifts during the 
period immediately after the war necessitated numerous changes both in 
scheduling and in the responsibilities for reviewing particular materials. 
In most cases the complete reports were only available in the official files 
and the committee was therefore dependent on obtaining the cooperation 
of a small number of individuals who were attempting to carry on the 
research work initiated during the war. Under these circumstances the 
assistance of the various collaborators is especially appreciated. The appre- 
ciation of the problems and the valuable assistance of the executive com- 
mittee have been important factors in the completion of this project. 


JOHN C. FLANAGAN, Chairman
CHAPTER I


General Reports of Research Programs 
for the Armed Forces 


JOHN C. FLANAGAN


DURING the period of World War II a large amount of research on
psychological and educational problems was conducted in and for the various
armed services. Because of military security measures and the pressure of 
current military duties and problems only a small fraction of this research 
was published in professional journals or otherwise made generally avail- 
able during the war. The purpose of this review is to bring to the attention 
of research workers the nature and scope of the research studies conducted 
so that the experience and findings of the wartime studies may be sum- 
marized in a single source. 

One of the first groups to become actively engaged in military research 
in the period preceding the entry of this country into the war was the 
committee of the National Research Council established at the request of 
Dean R. Brimhall, Director of Research, Civil Aeronautics Administration. 
The research in aviation psychology of this group has been reported in 
a series of Research Reports published by the Civil Aeronautics Adminis- 
tration. The findings of this series of studies have been reviewed by Viteles 
(5) and are not included in the present survey. This group, of which M. S. 
Viteles is chairman, is continuing an active program of research. It is 
now known as the Committee on Aviation Psychology of the National 
Research Council. 

A number of those most active in the early stages of the program dis- 
cussed in the preceding paragraph entered the Navy after our entry into 
the war and an Aviation Psychology Branch was established under the 
direction of the late John G. Jenkins in the Bureau of Medicine and Surgery 
in the Navy Department in Washington. The reports of this group are 
reviewed by Ames and Older in Chapter II of this survey. This work is 
continuing in the same location under the direction of Lieutenant Harry 
J. Older. 

The research program in aviation psychology reviewed by Frederick B. 
Davis in Chapter III was initiated in the summer of 1941. The research 
results of this group have been reported in a series of nineteen research 
volumes under the general title of Army Air Forces Aviation Psychology 
Program Research Reports (3). The scope of this program has been 
expanded and it is continuing under the direction of Glen Finch, Acting 
Research Chief, Division of Human Resources, Office of Research and 
Development, Headquarters, United States Air Force. 

The Adjutant General’s Office in the War Department established a 
Personnel Procedures Section under the technical direction of Walter V. 
Bingham in the fall of 1940. The numerous research studies carried out 
by this group have had very little circulation outside of the staff of this
group. Therefore the review of these studies by Sisson in Chapter IV 
should be especially valuable in bringing to the reader’s attention work 
done under the supervision of Marion W. Richardson, Edwin R. Henry, 
and others who directed this work. This work is continuing under the 
direction of Donald E. Baier. 

The program on personnel research and test development in the Bureau 
of Naval Personnel did not get started until late in the fall of 1942. The 
organization was directed by Alvin C. Eurich initially. He was succeeded 
by Raymond Faulkner. The work is being continued under the direction 
of Eugene D. Carstater. The work of this group has been reported in a 
volume (4) edited by Dewey B. Stuit. It was originally planned that this 
material be reviewed by one of the group who worked in the program. 
This proved to be impossible. The reviews of the published reports of this 
group are therefore included in the miscellaneous chapter. 

Thruout the war a substantial amount of research on personnel problems 
was conducted for the Navy by the National Defense Research Committee 
thru its Applied Psychology Panel. John M. Stalnaker was chairman of 
the original committee set up to handle this work. He was succeeded by 
Walter S. Hunter when the panel was formed. This group contracted 
with various universities and other organizations to carry out specific 
research and development projects requested by the armed services. The 
reports of these groups are listed in a bibliography prepared by Bray (1). 
An official summary report has also been published in two volumes (7). 
One is on aptitude and classification, the other on training and equipment. 
Both are edited by Wolfle. Another more popularly written account of the 
work of these groups has been prepared by Bray (2). Plans for the review 
of this work by personnel participating in the program could not be carried 
out and the published reports of this group have also been included in 
the reviews of the miscellaneous materials. 

In addition to the work done under the supervision of the Applied Psy- 
chology Panel there was a substantial amount of research done by civilian 
organizations under other auspices. One of the largest of such programs 
was the work of the Psycho-Acoustic Laboratory at Harvard University 
under the direction of S. Smith Stevens. A review of the work of this group 
is given in Chapter VI. 

Another program including a number of psychologists was the service 
work in assessing candidates for assignments for the Office of Strategic 
Services. This has been reported in a recently published volume prepared 
by a group of staff members (6). 

One additional set of reports on psychological work done in the services 
during the war is to be published. This is an account of the work of the 
Morale Services Division. This work was initiated by Major General Fred- 
eric Osborn and was carried out under the immediate supervision of 
Samuel S. Stouffer and Carl I. Hovland. The four volumes reporting the 
findings of this group are expected to be available soon. 

A number of psychologists rendered valuable services in many other
connections during the war. Published reports of many of the studies 
done under their direction are briefly reviewed in Chapter V. It is believed 
that a small number of important research studies carried out for the 
services during World War II have been overlooked. However, it is hoped
that thru the many reports listed in this review, research workers will be 
able to benefit from most of the valuable studies carried out during this 
period. 
CHAPTER II


Aviation Psychology in the United States Navy 


VIOLA CAPREZ AMES and HARRY J. OLDER 


THE INVESTIGATIONS reported in this chapter have been selected as
representative of both the type and the scope of the work developed by the
psychologists in the Aviation Psychology Branch, Bureau of Medicine and 
Surgery, Navy Department, under the direction of Captain John G. Jenkins. 
Much of the work of the Branch was of an advisory or applied nature 
which did not lend itself to written reports. Consequently, many aspects 
of the program are not in written form.* 

The development of the naval aviation psychology program up to and 
following the time of the establishment of the central office in October 
1942 may be read in several descriptive summaries (7, 8, 14, 15, 16, 
32, 40). Psychologists were originally commissioned to administer, score, 
and interpret tests for the selection of naval aviation cadets; however, the 
program soon broadened to include the development of experimental 
designs for research projects, statistical analyses, methods for selecting 
flight instructors and aircraft gunners, investigation of attrition, develop- 
ment of training aids, advisory aid to other bureaus, and research on vision 
and communication. 

The principal research groups in the naval aviation program were at 
Washington, D. C.; Pensacola, Florida; Corpus Christi, Texas; and Jack- 
sonville, Florida. The Washington group was primarily occupied with the 
administration of the program, the validation of the tests, the development 
of improved criteria, and consulting services. At Pensacola emphasis was 
on the investigation of problems of night vision training, disorientation, 
and intelligibility. Studies on fear and leadership were conducted at 
Corpus Christi. The Aviation Gunnery Group worked on the development 
of uniform curriculums for gunnery schools and improved grading systems,
and tested special devices (7, 8, 27).


Selection and Classification 


About a year and a half before the Pearl Harbor attack, work had begun
on the validation of a group of tests for the selection of naval aviators. 
From the forty different tests investigated, three were selected. Each of 
these three tests was validated on groups of over 3000 cadets (44). 

The three tests originally used were the Wonderlic Personnel Test (PT),
the Mechanical Comprehension Test (MCT), and the Biographical Inventory
(BI). In October 1942 the Wonderlic Personnel Test was replaced by the
Aviation Classification Test (ACT). Two forms of this test were developed
by the members of the Aviation Psychology Branch in such a manner as
to give maximal spread and maximal reliability in the region of the cutting
score (31, 23). Both forms had estimated odd-even reliabilities of over .92.


* The statements contained herein are the personal interpretations of the writers and are not to be
construed as reflecting the views of the Navy Department or the naval service at large.

Early investigations indicated the ability of the Wonderlic Personnel 
Test to discriminate between trainees who pass or fail (in aviation train- 
ing) among low-score groups. However, this was not true for the middle 
and upper score-range groups. The Personnel Test was found to be most
valuable for predicting ground-school failures (6). Like the Personnel Test, 
the Aviation Classification Test was found to predict academic failures 
(ground-school training) fairly well, but to be of no value in predicting 
flight-training failures. Biserial correlations of .29 and .38 are reported 
for the Aviation Classification Test based on all entrants into training 
versus ground-school training failures. 
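
The biserial coefficients quoted in this section relate a continuous test score to a pass-fail outcome that is treated as a division of an underlying normal variable. A minimal Python sketch of the usual textbook formula is given here for orientation; the function name and any data passed to it are hypothetical, and the reports reviewed do not state the exact computing routine they followed.

```python
from statistics import NormalDist, mean, pstdev


def biserial(scores, passed):
    """Biserial correlation between continuous scores and a pass/fail outcome.

    scores: test scores, one per trainee.
    passed: booleans of the same length (True = passed training).
    """
    pass_scores = [s for s, ok in zip(scores, passed) if ok]
    fail_scores = [s for s, ok in zip(scores, passed) if not ok]
    if not pass_scores or not fail_scores:
        raise ValueError("both outcome groups must be non-empty")

    p = len(pass_scores) / len(scores)   # proportion passing
    q = 1.0 - p                          # proportion failing
    z = NormalDist().inv_cdf(p)          # normal deviate dividing p from q
    y = NormalDist().pdf(z)              # ordinate of the normal curve at z
    sd = pstdev(scores)                  # standard deviation of all scores

    return (mean(pass_scores) - mean(fail_scores)) / sd * (p * q / y)
```

A positive value means that passing trainees tended to score higher on the test. Because the coefficient rests on the assumption of a normal variable underlying the pass-fail split, it can exceed the point-biserial value computed from the same data.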

New forms of the Mechanical Comprehension Test were developed by 
the Psychological Corporation for use in the naval aviation selection 
program. The estimated odd-even reliability was .80. The test-retest coeffi- 
cients varied from .84 to .87. That the Mechanical Comprehension Test 
predicted failures for both flight- and academic-training groups is evi- 
denced by the biserial correlations presented by Fiske (6). These range 
from .14 to .43 for flight training and from .15 to .48 for ground-school 
training. 
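
An "estimated odd-even reliability" is conventionally obtained by correlating scores on the odd-numbered items with scores on the even-numbered items and stepping the half-test correlation up to full length with the Spearman-Brown formula. The Python sketch below assumes that conventional procedure; the reports do not describe the computation in detail, and the function names and item matrix layout are illustrative only.

```python
from math import sqrt


def pearson(x, y):
    """Product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step a half-test correlation up to the reliability of the full test."""
    return 2 * r_half / (1 + r_half)


def odd_even_reliability(item_scores):
    """item_scores: one list of item scores (for example 0/1) per examinee."""
    odd = [sum(row[0::2]) for row in item_scores]    # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_scores]   # items 2, 4, 6, ...
    return spearman_brown(pearson(odd, even))
```

Under this formula a correlation of .50 between the two halves corresponds to an estimated full-test reliability of about .67, which is why odd-even figures run higher than the raw half-test correlation.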

The Biographical Inventory is a questionnaire with items on biographical 
information, interests, habits, and attitudes (6, 14, 38, 41). It was originally 
developed for use in the selection of civilian pilots, but was later adapted 
to naval aviation selection. 

The test-retest reliability was approximately .70 for a group of almost 
2000 men. The biserial correlations for the Biographical Inventory reported 
by Fiske (6) range from .15 to .40 for flight-training failures, .06 to .28 
for ground-school failures, and .21 to .36 for all failures. 

One of the most significant technical advances made in 1942 by the 
Aviation Psychology Branch was the introduction of a single index to 
represent various combinations of test scores. This index, called the Flight 
Aptitude Rating (FAR), combined the grades on the MCT and the BI (14).

Originally, a table was constructed to show the percent of failures among 
men obtaining each of the possible combinations of BI and MCT scores
(18). Cells with similar percents were grouped into one of five categories 
of progressively high failure rates. Later, the scale was divided into nine 
steps to permit finer discriminations. The biserial correlation between 
pass-fail groups and the FAR was .43. Since this value was exactly the 
same as the multiple R between pass-fail and the BI and MCT, it indicated
that the FAR made the maximum use of differentiations provided by the 
tests (6). 
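
The construction described in this paragraph, a table of failure percentages for each combination of scores with cells then grouped into a small number of categories, can be sketched in Python as follows. This is an illustration under stated assumptions: the cut points in `CUTS` are hypothetical (the source gives none), and the record layout and function names are invented for the example.

```python
from collections import defaultdict


def failure_table(records):
    """records: iterable of (bi_grade, mct_grade, failed) tuples.

    Returns {(bi_grade, mct_grade): percent of failures in that cell}.
    """
    counts = defaultdict(lambda: [0, 0])   # cell -> [failures, total]
    for bi, mct, failed in records:
        cell = counts[(bi, mct)]
        cell[0] += int(failed)
        cell[1] += 1
    return {cell: 100.0 * f / n for cell, (f, n) in counts.items()}


# Hypothetical upper bounds (percent failing) for categories 1 through 4;
# any cell with a higher failure rate falls in category 5.
CUTS = (10.0, 20.0, 35.0, 50.0)


def rating(percent_failed, cuts=CUTS):
    """Assign a cell to one of len(cuts) + 1 categories of rising failure rate."""
    for category, upper in enumerate(cuts, start=1):
        if percent_failed <= upper:
            return category
    return len(cuts) + 1
```

Passing a tuple of eight cut points instead of four would give the nine-step version of the scale.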

Early in the program it was found that age correlated with outcome of 
training. The younger cadets were more likely to graduate than the older 
ones. It was also evident that extent of previous flight training predicted 
outcome of training, but not as well as either the BI or MCT. As for
education, cadets with no previous flight training and less than two years of
college showed a significantly higher percent of failures than those with 
no previous training but at least two years of college (6). 

Success in the development of technics of selection for naval aviation 
cadets suggested the feasibility of similar technics for the selection of 
flight instructors. Technical Memorandum No. 7 (39) outlined the ap- 
proach to the project. The steps, in order, were: (a) to identify two 
groups of flight instructors representing the “tails” of the distribution of 
instructor ability; (b) to determine specific characteristics which dis- 
criminate between these extremes; (c) to develop a scoring key and check 
its validity. The tests used were: PT, MCT, BI, the Aviation Preference 
Check List, the Opinions on Flight Instruction Inventory, and the Aviation 
Experience Record. The last three tests were developed expressly for this 
study. Data were completed on 905 instructors. Five types of criteria were 
established. As a result of this study the Instructor Aptitude Rating Scale
was devised for the selection of instructors. 

Trumbull and Vinacke (29) reported an evaluation of the Diagnostic 
Scale for Rating Flight Instructors. The scale was composed of thirty-five 
items in terms of which a student was asked to assess the merit of his 
instructor. Thirty-four instructors from two squadrons were used in the 
trial groups. The results indicated that the five degrees along the scale 
were far from equal for all questions; several questions were unsatisfactory 
in terms of consistency of the scale, but the majority of the items were 
relevant. A revised questionnaire was developed as a result of the study. 

An analysis of flight instructor selection technics was reported by 
Trumbull and Vinacke (28). Well-defined criterion groups of “good” and 
“poor” instructors were compared. Differences between the groups on
components of selection tests were evaluated with the conclusion that the type
of material used in these tests was of value in selecting instructors, but a 
majority of the items did not give the best prediction for the population 
used in this study. 

The Pensacola group worked on questionnaires for selection for 
advanced training. Many different criteria were used for selection purposes. 
One of these, low pressure tolerance, was eliminated after completion of 
Research Project R7-2 on classification tests in low pressure chamber (27). 


Training 


The aviation psychologists who were attached to the Naval Air Training 
Commands were engaged in a variety of training projects. Among their 
contributions were: (a) the development and introduction of improved 
training records, forms, and procedures; (b) aid in the preparation, 
evaluation, and revision of syllabi and training manuals for both flight- 
and ground-school instruction; (c) the improvement of testing methods 
and grading procedures; (d) statistical analyses of such factors as student 
flow, causes of attrition, and comparison of records from different training
stations (7, 8, 14, 15, 16, 27). 

Considerable work was done on the standardization of flight instructor’s 
vocabulary as one of the basic problems of naval aviation training. A 
technic was developed which permitted sound recordings of all conver- 
sations between instructor and student during an instructional flight. The 
apparatus consisted of a two-way electrical interphone which also served 
as a modulator for a light-weight high-frequency transmitter. Thru this 
device it was possible to “listen in” and make recordings on the ground 
of conversations in the air. These conversations were typed and studied 
in detail. From the results the “Patter” book for flight instruction was 
written (19). 

At Pensacola various analyses of attrition were made. Among the 
reports are: “Analysis of Attrition Trends in Aviation Cadets,” “Chrono- 
logical Analysis of Requests To Be Dropped from Training,” and “Analysis 
of Attrition—Primary Land Planes” (27). 

A second major function of the Pensacola group was the investigation 
of visual problems in naval aviation training. Studies of night vision test- 
ing instruments, new color testing devices, and night vision training pro- 
cedures were carried out. 

A preliminary report on “Loss of Visual Contrast Discrimination” in- 
cludes the following statement: “Loss of visual discrimination can be both 
predicted and measured under conditions of mild anoxia. The particular 
form of the test (Hecht) is unsatisfactory due to the large proportion of 
men failing to show the anoxia effect, or failing to comprehend the instruc- 
tions” (27). 

The autokinetic illusion was studied in the laboratory with light 
stationary, light and/or subject moving, and in night formation flights. 
Autokinesis is universally experienced by normal persons; the delay in 
onset with a single light is short. Movement, in one direction, lasts about 
ten seconds. A single spot is seen to move about half the total fixation time. 
The illusion is only slightly subject to voluntary control. Increasing the 
frame of visual reference reduces but does not readily abolish the illusion, 
and it is reduced by more adequate spatial localization of object, by rapid 
relative movement of the target, and by shifts in attention. “The Autokinetic 
Illusion and Its Significance in Night Flying,” by Graybiel and Clark (10) 
reported these findings. 

An investigation of the role of vestibular nystagmus in the visual per- 
ception of a moving target in the dark by Graybiel, Clark, MacCorquodale, 
and Hupp (12) is an extension of the above study. Six subjects reported 
their visual perceptions both during and following rotation while observing 
a moving target in the dark and in a lighted room. When a subject was 
accelerated to 15 rpm in the dark, there was a rapid displacement of the 
target in the opposite direction, altho, at the same time, as a result of 
nystagmus, the target appeared motionless. Following cessation of rotation 
to the right at 15 rpm the target appeared to move very rapidly to the 
left. Following cessation of rotation to the left, the target appeared to rush
rapidly to the right while it was displaced to the right very slowly. 

These phenomena, which did not occur in a lighted room, can be con- 
sidered as a summation of the effects of real motion of the target, vestibular 
nystagmus, and the subject’s sensation of their own motion. These effects 
have important implications in the explanation of “vertigo” in pilots. 

An analysis was also made of the concept of aviator’s “vertigo,” based 
upon personal interviews with Naval aviators by Vinacke (48). He con- 
cluded that “the term ‘vertigo’ as used by aviators covers a wide variety 
of events occurring under many different conditions of flying. The term 
‘vertigo,’ as used by pilots, should be accepted as referring to any sensation, 
or feeling, which does not accord with observable environmental facts.” 

The oculogyral and oculogravic illusions were studied in flight using 
three subjects who observed a fixed luminous target in the dark. Observa- 
tions were made in the rear cockpit of a standard navy training plane. 
The subject gave a running account of the apparent motion and displace- 
ment of the target while the pilot maneuvered the plane thru different 
degrees of bank (3). 

Studies from the Pensacola laboratory have demonstrated several illu- 
sions of movement which may occur in flight. Three of these, the autokinetic 
illusion, the oculogyral illusion, and the oculogravic illusion, were studied
extensively. 

Vinacke (49) reported a detailed description of the types of illusions 
reported by a large number of pilots as occurring in aircraft. The illusions 
described by the aviators were categorized into five general types: visual,
nonvisual, conflicting sensory cues, dissociational or recognitional, and 
general emotional. 

The speech intelligibility research program at Pensacola was initiated 
in 1942. Preliminary research indicated a need for more thoro analyses of 
the factors contributing to poor intelligibility of voice communications. 
One study of 200 instructors disclosed that only 14 percent had poor
phonation (loudness, pitch, quality) in normal conversation, but 80 percent
had poor phonation under simulated flight conditions. 

The researches on speech intelligibility covered the message, the talker,
the transmission system, and the listener. Early in the program it was 
noted that certain words have a better acoustic penetration in noise. 
A study of vocabulary used by gunners in intercommunications procedure 
revealed that some words had less than 10 percent intelligibility value. 
Another observation revealed that long words had higher intelligibility 
value than short words (45). 

Speech technics have been developed to improve aerial voice communi- 
cations. Two types of transmission systems have been studied: (a) the 
Gosport (acoustical) and (b) the Radio (electrical). Intensive studies of 
the Gosport speaking type system led to modifications which improved 
intelligibility of voice transmission. Microphones, earphones, and oxygen 
masks have been studied, also. The speech laboratory has developed various 
methods to check listening ability during flight. It found little relationship
between audiometric examination results and listening ability in noise 
(22, 23, 45, 46). 

Steer, Lawrence, and others (24) reported an evaluation of the Gosport 
speaking tube. Flight and laboratory tests were conducted to evaluate 
the relative advantages of the old and new Gosport. An experimental feed- 
back system which allowed the instructor to hear himself talk to the student 
was also tested. 

The speech intelligibility training program was described in detail by 
Steer and Hadley (23). They also gave a bibliography of the research 
projects completed in the laboratory. 


Measurement of Proficiency (Criteria) 


The Washington group early recognized the lack of systematic treatment 
of the criterion-to-be-predicted problem. They concerned themselves with 
efforts to establish methods of collecting and recording criterion data, with 
the investigations of factors influencing the reliability and validity of the 
criterion and with the development of technics of analysis. Jenkins (17) 
in a recent article summarizes the thinking of the group on the more 
pertinent aspects of the problem. 

The principal criterion used for the validation of the various selection 
devices was outcome-of-training (the award of the “wings” or the dismissal 
from training). Outcome-of-training was further refined into reason-for- 
failure, such as ground-training failures, psychologically unsuited, dropped 
at own request, etc. These criteria, naturally, were neither highly reliable
nor valid for the prediction of combat pilot success (6, 14, 17). 

The first attempt to obtain combat criterion data was made by four 
naval psychologists who interviewed pilots with combat experience as they 
returned to the United States. Approaches considered and/or attempted 
were: (a) to determine what characteristics were important in meeting 
combat-requirements, (b) to obtain ratings or rankings of all members 
of an air group, (c) to use decorated versus undecorated pilots, and (d) to
use number of planes shot down. It was finally decided to attempt to 
identify men regarded by fellow pilots as either definitely wanted or 
definitely not wanted as a member of their combat team. 

A member of the Aviation Psychology Branch was sent to the Pacific 
area to develop basic methods of obtaining combat criterion data. The 
“high” nominations were sought by asking the respondent to name two 
men of his acquaintance (living or dead, regardless of rank) on whom 
he would most like to fly wing in combat. Nominations for the “low” group 
were obtained by asking him to name two men whom he would not like 
to have flying wing on him in combat (47). 

Further nominations were collected by one psychologist at a west coast 
port and by four psychologists in the Pacific area. Over 800 respondent pilots
were contacted, yielding approximately 1600 high and 1600 low nominations.
The free-response data were later categorized for coding. The original
thirty-three unit-categories were reduced to twenty-six. These twenty-six 
categories when sorted formed five category-clusters (33, 34, 42, 43). 

From these categories two checklists were constructed, one for the high 
or “wanted” pilots and one for the low or “unwanted” pilots. The checklist 
method yielded 2872 respondents with a total of 4325 nominated pilots. 
Of these nominees, 2267 were nominated as highs, 1832 were nominated 
as lows, and 226 pilots were nominated for both high and low by different 
respondents. The fact that so few pilots received conflicting nominations 
is taken as evidence of the validity of the nomination technic (35, 36, 37). 
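
The tally behind these figures can be sketched in Python: each nominee is classed as high only, low only, or both, according to the nominations received from different respondents. The function name, the identifiers, and the data layout are hypothetical and serve only to make the counting explicit.

```python
def tally_nominations(nominations):
    """nominations: iterable of (nominee_id, kind), kind being 'high' or 'low'.

    Returns counts of distinct nominees named high only, low only, and both.
    """
    highs, lows = set(), set()
    for nominee, kind in nominations:
        if kind == "high":
            highs.add(nominee)
        elif kind == "low":
            lows.add(nominee)
        else:
            raise ValueError(f"unknown nomination kind: {kind!r}")

    both = highs & lows   # nominees who drew conflicting nominations
    return {
        "high_only": len(highs - both),
        "low_only": len(lows - both),
        "both": len(both),
        "total": len(highs | lows),
    }
```

The reported counts are internally consistent: 2267 + 1832 + 226 = 4325, so about 5 percent of the nominated pilots received conflicting nominations.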

In a report of the Combat Criterion Project to date, Carroll (1) reported 
on the preliminary work, the technic for coding free-response materials 
into categories, the use of sociometric diagrammatic technics, experimental 
design, and nature of the population. 

An incidental investigation was made of the relationship of frequency 
of response to importance of response. The results indicated considerably 
less than a perfect correlation (43). This may have implications for 
future research. 

Trumbull and Vinacke (30), concerned with the problem of a criterion 
for the validation of flight instructor selection tests, used student evalu- 
ations of their flight instructors to establish criterion groups for analysis 
of selection data. The agreement among six criteria of success was deter- 
mined, and the 20 percent of instructors rated best and 20 percent rated 
low were isolated. The six criteria showed agreement. Using a composite 
of these six, the extremes of flight instructors were defined. 
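
One way to form such extreme groups is sketched below in Python. The sketch assumes an equally weighted composite of standardized criterion scores; the source does not say how the six criteria were combined, and all names are illustrative.

```python
from statistics import mean, pstdev


def composite_scores(criteria):
    """criteria: {instructor_id: [score on each criterion]}, higher is better.

    Returns the mean standard score across criteria for each instructor.
    """
    ids = list(criteria)
    n_crit = len(next(iter(criteria.values())))
    totals = dict.fromkeys(ids, 0.0)
    for k in range(n_crit):
        column = [criteria[i][k] for i in ids]
        m, sd = mean(column), pstdev(column)
        for i in ids:
            # A criterion with no spread contributes nothing to the composite.
            totals[i] += (criteria[i][k] - m) / sd if sd else 0.0
    return {i: totals[i] / n_crit for i in ids}


def extreme_groups(composite, fraction=0.20):
    """Return (best, lowest) lists, each holding the given fraction of instructors."""
    ranked = sorted(composite, key=composite.get, reverse=True)
    k = max(1, round(len(ranked) * fraction))
    return ranked[:k], ranked[-k:]
```

Standardizing each criterion before averaging keeps a criterion with a wide numerical range from dominating the composite.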

Another approach to the criterion problem was made in the validity 
study of five targets for testing visual acuity thru the correlation of the 
test results of each target with the Grow Chart scores. In addition, the test- 
retest reliabilities of all six tests were studied. Acuity scores, obtained in 
Snellen equivalents, were translated into log-units to facilitate statistical 
analysis. Additional systems were assayed for scoring each of the Randolph 
Field tests (26). 
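
The translation of Snellen equivalents into log-units can be illustrated with a short Python sketch. It assumes the transformation is the base-10 logarithm of the Snellen fraction; the report's exact scoring convention is not given here, so the function is a hypothetical illustration.

```python
from math import log10


def snellen_to_log(numerator, denominator):
    """Base-10 logarithm of a Snellen fraction, e.g. 20/40 -> log10(0.5)."""
    if numerator <= 0 or denominator <= 0:
        raise ValueError("Snellen terms must be positive")
    return log10(numerator / denominator)
```

On this scale 20/20 maps to zero, and equal ratios of acuity (20/10 to 20/20, 20/20 to 20/40) become equal intervals, which suits the computation of means and correlations.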

Estimates of the reliability of the Verhoeff test of depth perception were 
computed in a test-retest study (25). Four scoring methods were studied 
for their relative reliability and discrimination between levels of depth 
perception. 


Attitudes, Morale, and Leadership 


The Corpus Christi group became interested in basic emotional and 
social problems. One product was a discussion of the psychology of fear 
with emphasis on how to counteract it. A survey of attitudes and informa- 
tion regarding the war was made. The problem of leadership and organiza- 
tion in patrol plane crews was brought out for examination and treat- 
ment (8). 

A preliminary questionnaire study was made of the feasibility of using 
the nominating technic at preflight schools for evaluating leadership and
associated qualities of aviation cadets and student aviation pilots. Two 
questions were asked: “What two men in your present platoon would you 
select as leaders for the new one?” “What two men in your present platoon 
would you least desire as leaders of the new platoon?” (31). 


Tabulating and Analysis Technics 


Much of the work of the naval aviation psychologists consisted of the 
development of technics of analysis. Unfortunately little of this work has 
been put into written form. 


At Pensacola, Graybiel, Clark, and MacCorquodale (11) reported a
method for observing and reporting the effect of angular acceleration and
variations in “g” on visual perception during flight. The visual stimulus
was a collimated “star” installed in the rear cockpit of a standard navy 
training plane. All observations were made in complete darkness. Both 
the pilot’s and observer’s verbal reports were dictated into an airborne 
wire recorder which also provided a time limit. These recordings were 
transcribed in the laboratory, and all analyses made from them. Fiske and 
Dunlap (9) presented a graphical test for the significance of differences 
between frequencies from different samples. 
CHAPTER III


Psychological Research in the AAF Aviation
Psychology Program


FREDERICK B. DAVIS 


THE reports of psychological research reviewed in this chapter were
written by military and civilian personnel of the Army Air Forces Aviation
Psychology Program. These reports are all in published form; originally, 
the reviewer had hoped to include unpublished research reports (of which 
hundreds are on file), but several considerations made this inadvisable. 
In the first place, the nineteen AAF Aviation Psychology Program Research 
Reports which were listed by Flanagan (27) include most of the important 
research findings that were presented in the unpublished documents and 
that are not subject to restrictions for security purposes. In the second 
place, the task of reviewing the unpublished materials and assigning credit 
for the research reported in them proved to be prohibitive. 

In addition to the officially approved reports and articles reviewed, this 
chapter includes a few others that present results of research conducted in 
the AAF Aviation Psychology Program. 

Two articles that were written by personnel of the AAF Aviation Psy- 
chology Program about aviation psychology in enemy countries indicated 
clearly that in this field the air forces of the United States and its allies 
were far ahead of the German and Japanese air forces. Fitts (22), who 
served as official representative of the AAF Aviation Psychology Program 
on a mission to Germany for the purpose of studying the technics and 
procedures used by German Air Force psychologists during the war, re- 
ported that concepts of objectivity, standardization, reliability, and validity 
were almost completely disregarded by the German psychologists. So far 
as could be determined, no contributions to technic were made that would 
be of value to American psychologists. Geldard and Harris (35) visited 
Japan in November and December of 1945 to assess the work of psychol- 
ogists in the Japanese Air Forces. They found that both the Japanese Army 
and Navy Air Forces used batteries of paper-and-pencil tests and psy- 
chomotor tests to select men for pilot training. In general, it is interesting 
to note that, so far as aviation psychology is concerned, Japanese psychol- 
ogists seemed to be far more advanced than their German counterparts. 

It is not generally known how considerable were the contributions of 
the AAF Aviation Psychology Program to air-crew selection procedures 
employed by the Royal Air Force, the Royal Canadian Air Force, the 
Australian Air Force, and the South African Air Force. Special air-crew 
classification batteries were actually designed for use in the French, 
Chinese, and Philippine Air Forces. Lyerly (70) has discussed the 
preparation and use of these batteries in some detail. 


Organization and Development of the 
AAF Aviation Psychology Program 


In Report No. 1 of the AAF Aviation Psychology Program Research 
Reports, Flanagan (25) described the development of the program, its 
main findings and accomplishments, and their implications for psychology 
and education. Following a brief introduction (25, Chapter 1), he pre- 
sented the historical background (25, Chapter 2) essential for an under- 
standing of the program and quoted official directives concerning it (25, 
Chapter 3). The objectives of the AAF Aviation Psychology Program were 
stated in 1943 and again in 1945 in articles in the Psychological Bulletin 
(81, 82). The organization and personnel of various research units were 
also discussed in these articles. Thorndike (101) has summed up the 
psychological research work in the AAF Aviation Psychology Program 
under two headings: first, the development and validation of tests for use 
in selecting and classifying air-crew personnel; and second, the solution of 
problems required to maximize the combat efficiency of personnel. 

Activities of psychologists in the AAF Training Command and some of 
the results of their work were described in an article prepared by the staff 
of the Psychological Section, Headquarters, AAF Training Command (89). 
DuBois (18, Chapter 2) outlined the location and functions of the psycho- 
logical units in the Training Command and Gilmer and Preston (37, 
Chapter 8) mentioned some of the administrative problems encountered 
in their operation. Simon and Berwick (96, Chapter 16) provided informa- 
tion concerning the special services performed during the war by the 
Statistical Unit of the Psychological Branch in the Headquarters of the 
Training Command. 

In addition to units in the continental United States, several detachments 
of psychologists were sent overseas for temporary duty. The histories and 
objectives of these detachments and of other missions undertaken abroad 
by members of the AAF Aviation Psychology Program were summarized 
by Lepley (66, Chapters 1, 2). In general, the detachments obtained combat 
validation data for test scores, made analyses of combat requirements, 
studied the aptitudes required of lead-crew personnel, and developed 
proficiency measures for air-crew specialties. 


In the first published account of work in the AAF Aviation Psychology 
Program, Flanagan (28) reported the initial steps in developing a test for 
selecting air-crew members in the AAF. This test, first called the Aviation 
Cadet Qualifying Examination and later the AAF Qualifying Examination, 
was further described in subsequent publications (25, Chapter 4; 80), 
and particularly in a volume edited by Davis (14). The latter traced the 
development of the AAF Qualifying Examination over a period of four 
years (14, Chapter 1), described the research work underlying its 
development (14, Chapter 3), and made a general evaluation of its usefulness to 
the Army Air Forces (14, Chapter 12). The principles employed in con- 
structing this Qualifying Examination were set forth in some detail (14, 
Chapter 2) and should prove of interest to technicians confronted with 
the problem of assigning individuals to “accepted” or “rejected” groups 
without regard to individual differences within each group so obtained. 
The Qualifying Examination was constructed to serve a particular purpose, 
tho it found many uses (14, Chapter 4). 

In seven successive chapters of AAF Aviation Psychology Research Re- 
port No. 6, Davis (14) reported research on many kinds of test items tried 
out for use in the Qualifying Examination. Of three types of verbal items, 
reading-comprehension items were most useful for predicting graduation 
or elimination from pilot training in the AAF (14, Chapter 5). Factorial 
studies suggested that word knowledge and reasoning in reading are two 
important skills involved in reading. This result agreed with prewar studies 
by Davis. Successful efforts to develop tests of factual information that 
measure interests significantly related to graduation or elimination from 
pilot training in the AAF were described (14, Chapter 6). The technics 
employed should prove applicable to the construction of tests for educa- 
tional and vocational guidance. 

Objective test items that measure judgment and reasoning were found 
by Davis (14, Chapter 7) to be factorially complex. Several reasoning 
factors were identified, one of which was significantly related to gradua- 
tion or elimination from pilot training in the AAF. A mental skill believed 
to be peculiar to what is known as “judgment” was determined and named 
“evocation,” the ability to call relevant information to mind. 

The most useful items for predicting performance in pilot training were 
said by Davis (14, Chapter 8) to be mechanical-comprehension items. 
An investigation revealed that their variance could be accounted for almost 
entirely by four independent factors. The design of the factorial study was 
novel and should be of interest to students of factorial analysis. The use- 
fulness of twenty types of machine-scorable perceptual-test items for pre- 
dicting graduation or elimination from pilot training in the AAF was 
discussed by Davis (14, Chapter 9), and methods for ascertaining their 
efficiency in combination were outlined. Other types of items for which 
validity data were reported by Davis (14, Chapter 10) included mathe- 
matics items, interpretation-of-data items, and printed psychomotor items. 
The latter were especially recommended for additional research. 

The Victory Corps Aeronautics Aptitude Test which was widely dis- 
tributed by the U. S. Office of Education was constructed under Davis’s 
supervision and was described by him (14, Chapter 11). General pro- 
cedures for devising and refining aptitude-test forms were discussed by 
Thorndike (101, Chapter 3). 

The psychological research on the selection and training of bombardiers 
in the AAF that was accomplished prior to the establishment of Psycho- 
logical Research Unit (Bombardier) was summarized by Johnson (55, 
Chapter 2). Research on the selection of instructors for bombardier schools 
was reported by Larson (64, Chapter 7), including data that indicated 
substantial validity for the Instructor-Selection Stanine. Melton presented 
(73, Chapter 23) information concerning the validation of eight apparatus 
tests against criteria of performance in bombardier training. 

McClelland and Dailey discussed the correlations of twenty-two scores 
derived from the Air-Crew Classification Battery with five criteria of pro- 
ficiency in flight-engineer training (71, Chapter 5). Intercorrelations of 
a number of tests constructed especially for selecting flight engineers and 
tests in the Air-Crew Classification Battery were also reported. A com- 
plete report on the problem of selecting flight engineers was made in the 
volume edited by Dailey (13), including a summary of the research up 
to 1946 and suggestions for future work in the field (13, Chapter 6). 

Six phases of the problem of selecting gunners were described by 
Stolurow and Schrader (98, Chapter 6). Difficulties in obtaining satis- 
factory criterion variables and practical limitations that prevented the 
elimination of more than a small proportion of trainees were major handi- 
caps. Schrader, Pascal, and Valentine reported the development of a 
selection test for gunnery officers, which showed a significant positive cor- 
relation with performance in the Combat Gunnery Officers Course (93, 
Chapter 13). The selection and training of instructors in schools for 
flexible gunners were discussed by Stolurow, Irion, and Pascal (98, 
Chapter 12). After consideration of a number of possible criteria, gun- 
camera scores were chosen for use in experimental studies reported by 
Melton (73, Chapter 21) on the selection of flexible gunners. 

Tests used to select men for navigator training and the research data 
pertaining to them were described by Carter and Michael (6, Chapter 3). 
The instruments devised to predict performance as an instructor in navi- 
gation schools were considered by Zielonka, Rust, and Rosemark (114), 
together with data regarding their effectiveness in measuring specified 
criterion variables. 

A series of studies relating to the selection and evaluation of instructors 
in pilot-training courses in the AAF were reported by Galt and Grier 
(34, Chapter 14). Work on the prediction of performance in pilot training 
is reviewed in this chapter in connection with the Aviation Cadet Qualifying 
Examination and the Air-Crew Classification Battery. 

The history of research work on the selection of radar observers was 
written by Kunsman (61) and the validation of selection tests for radar- 
observer training courses was discussed by Kelley (57, Chapter 11). 
Multiple correlations (subject to shrinkage) of .36 to .50 with criteria 
consisting of course grades were obtained. Apparatus tests administered 
at Langley Field showed, according to Melton (73, Chapter 22), no 
significant correlations with any one of four criteria of success in radar training. 
Intercorrelations of the tests, obtained at Carlsbad Army Air Field, 
were low. 
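
The qualification "subject to shrinkage" means that a multiple correlation computed on the same sample used to fit the test weights will usually overstate the correlation to be expected in a fresh sample. The sketch below illustrates the idea with the familiar adjusted R-squared formula, which is an assumption here and is not stated in the reports reviewed; the sample size and number of tests are hypothetical, since the source gives neither.

```python
import math

def shrunken_r(r, n, k):
    """Estimate a multiple correlation after shrinkage.

    r: multiple correlation observed in the fitting sample
    n: number of cases (hypothetical; not given in the source)
    k: number of predictor tests (hypothetical; not given in the source)
    """
    if n <= k + 1:
        raise ValueError("need more cases than predictors plus one")
    adj_r2 = 1.0 - (1.0 - r * r) * (n - 1) / (n - k - 1)
    # A negative adjusted value means no dependable relationship; report 0.
    return math.sqrt(adj_r2) if adj_r2 > 0 else 0.0

# The range of observed values reported above, under assumed n and k.
for observed in (0.36, 0.50):
    print(observed, round(shrunken_r(observed, n=200, k=8), 3))
```

With few cases and many tests the adjustment is large; with many cases it is negligible.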

Mollenkopf and Chaplin reported the design, construction, and use of 
tests for selecting instructors in the AAF Personnel Distribution 
Command (77, Chapter 2). Several weighted composite scores (stanines) were 
derived from these tests. Descriptions of motion-picture tests constructed 
for aptitude measurement by the Psychological Test Film Unit were 
presented by Lamkin, Schafer, and Gagne (63, Chapter 5). The tests gen- 
erally displayed low positive correlations with graduation or elimination 
from pilot training and contributed so little to the prediction of that 
criterion that the expense of using them for practical purposes could 
not be justified. 
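
The weighted composite scores (stanines) mentioned above can be illustrated with a short sketch. A stanine is a score on a nine-point standard scale whose bands hold roughly 4, 7, 12, 17, 20, 17, 12, 7, and 4 percent of the reference group. The test weights below are hypothetical, and the percentile-rank conversion is one common way of assigning stanines; the actual AAF weights and conversion tables are not given in this review.

```python
from bisect import bisect_left, bisect_right

# Cumulative percent of the reference group at or below stanines 1..8
# (bands of 4, 7, 12, 17, 20, 17, 12, 7, 4 percent).
CUM_PCT = [4, 11, 23, 40, 60, 77, 89, 96]

def composite(test_scores, weights):
    """Weighted sum of one man's test scores (weights are hypothetical)."""
    return sum(w * s for w, s in zip(weights, test_scores))

def stanines(composites):
    """Map each composite score to a stanine (1-9) by its percentile rank."""
    ordered = sorted(composites)
    n = len(ordered)
    result = []
    for score in composites:
        below = bisect_left(ordered, score)
        at_or_below = bisect_right(ordered, score)
        # Mid-rank percentile: cases below plus half of the tied cases.
        pct = 100.0 * (below + at_or_below) / (2 * n)
        result.append(bisect_right(CUM_PCT, pct) + 1)
    return result

# Usage: three tests per man, hypothetical weights.
men = [(52, 40, 61), (47, 55, 58), (60, 62, 70), (35, 41, 44)]
weights = (2, 1, 3)
print(stanines([composite(m, weights) for m in men]))
```

Applied to a large reference group, the conversion reproduces the nine percentage bands listed in the comment.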

The most rigorous study of the prediction efficiency of the procedures 
used in the AAF Aviation Psychology Program for selecting men for pilot 
training was designed by Flanagan (26). The study was unique and 
should prove invaluable to students of mental measurement. Thorndike 
reported the detailed results of the study (100, Chapter 5), which was 
based on the records of a large sample of applicants for pilot training 
who were admitted to training regardless of their scores on the Aviation 
Cadet Qualifying Examination and the Air-Crew Classification Battery. 
Case studies were made of sixteen men who obtained low scores on the 
selection tests and yet succeeded in completing pilot training and on 
fifteen men who obtained high scores and failed in pilot training. Walton 
presented two of these case studies as illustrations (105, Appendix C). 


Classification 


As explained by Flanagan (25, Chapter 4), after initial selection by 
means of the Aviation Cadet Qualifying Examination, men accepted for 
air-crew training were classified for specialized training as pilots, bombar- 
diers, navigators, gunners, etc. Flanagan outlined the essentials of the 
classification problem and mentioned the efficiency in the utilization of 
personnel that can be secured by differential classification. In a volume 
edited by DuBois (18) the classification program in the AAF was explained 
in detail. DuBois (18, Chapter 1) recounted the history of and plans for 
the classification testing of aviation students; in collaboration with 
Preston, he described the composition of the air-crew classification batteries 
and certain statistical data derived from their use (18, Chapter 3). Exten- 
sive data concerning the validity of stanine scores derived from successive 
classification batteries were reported by DuBois, Preston, and Peltier (18, 
Chapter 4). A description of group testing in AAF classification centers 
and a discussion of the standardization of testing procedures were presented 
by Gilmer and Preston (37, Chapter 2; 37, Chapter 1). The authors like- 
wise described the personal interviews with aviation students and the 
criteria used in recommending them for types of air-crew training (37, 
Chapter 7). 

Articles concerning the personnel and organization of Psychological 
Research Units 1, 2, and 3 appeared in the Psychological Bulletin (86, 
87, 88). Research activities of the units were also described briefly. 

The development of tests for air-crew classification has been summarized 
in volumes edited by Guilford and Lacey (43) and by Melton (73). Tho 
these tests were designed for use in the classification battery, the criterion 
for judging their value was their contribution to the prediction of per- 
formance in one or more air-crew specialties, as pointed out by Humphreys 
(52, Chapter 2) in a discussion of the program of printed-test develop- 
ment. It is reasonable to suppose that quite different judgments of value 
would have been made had the criterion for judging value been a test’s 
contribution to predicting only that part of an air-crew specialty not present 
in other specialties for which performance was to be predicted. Yet it is 
this type of differential prediction that is the crux of the classification 
problem. In practice, therefore, the Air-Crew Classification Battery served 
as a multiple selection test among men initially selected by means of the 
Aviation Cadet Qualifying Examination. 

Following an introduction to tests of intellect and information prepared 
by Humphreys (52, Chapter 4), Mock described tests of verbal ability 
(76, Chapter 5), Davis presented data concerning mechanical tests (16, 
Chapter 13) and mathematics tests (16, Chapter 6), and Fruchter reported 
the findings regarding a trait called judgment (31, Chapter 8) and the 
development of information tests (31, Chapter 14). Lacey and Tait 
reviewed research work on reasoning tests that were not incorporated in 
the Air-Crew Classification Battery (62, Chapter 7) and Zimmerman pre- 
sented data concerning tests of visualization and offered hypotheses regard- 
ing the mental traits measured by these tests (115, Chapter 12). The 
construction of measures of foresight and planning, and data pertaining 
to their factorial composition, were discussed by Guilford and Mock (43, 
Chapter 9). These authors also reported the development of tests of 
integration, the latter being defined as the ability to pay attention to 
several variables simultaneously and to respond to a combination of them 
(43, Chapter 10). Research on memory tests was reviewed by Lipman, 
Patterson, and Shirley (67, Chapter 11). Evidence of the existence of 
three independent factors thought to represent aspects of memory ability 
was adduced. 

The outline of plans for constructing perceptual tests was provided by 
Lacey (62, Chapter 15). Zimmerman discussed the development and 
factorial composition of perceptual speed tests (115, Chapter 16) while 
Lacey described the printed tests of form perception developed for possible 
use in the Air-Crew Classification Battery (62, Chapter 17). The nature 
of eleven tests of size and distance was considered by Lacey and Shirley 
(62, Chapter 18) and their value as tests of pilot aptitude was mentioned. 
Lacey and Niehaus (62, Chapter 20) reported efforts to measure the 
ability to determine one’s location relative to landmarks, while tests 
designed to measure other spatial abilities were discussed and their 
factorial content hypothesized by Howe and Zimmerman (51). Fruchter 
(31, Chapter 21) described experimentally developed measures of set and 
attention. 

The general approach to the problem of organizing and presenting 
material for testing emotion, temperament, and personality was outlined 
by Guilford (43, Chapter 22). According to Cerf (8, Chapter 23), per- 
sonality inventories and questionnaires that were commercially available 
in 1942-1945 failed to yield scores significantly related to performance in 
pilot training in the AAF. Furthermore, Cerf concluded (8, Chapter 24) 
that predictions of such performance made by clinicians on the basis of 
sets of test scores and subjective judgment were of little or no value. A 
description of the biographical data blank adapted by the AAF from the 
form used by the Civil Aeronautics Administration and the Navy Bureau 
of Medicine and Surgery was provided by Mock (76, Chapter 27), who 
presented evidence of its value. Measures of specific traits of temperament 
that were developed or tried out in the AAF Aviation Psychology Program 
were discussed by Davis (16, Chapter 25). Grossman presented data (41, 
Chapter 26) concerning tests of motivation. 

One of the most interesting fields of investigation of the AAF Aviation 
Psychology Program was that of mass testing with apparatus tests. The 
history of the development of these tests was recounted by Melton (73, 
Chapter 1), who has discussed the problems arising in the course of the 
unprecedented use of apparatus tests and the technics devised to cope with 
these problems (73, Chapter 2). Melton has summarized (73, Chapter 25) 
the conclusions reached on the basis of over four years of intensive research. 
He has also discussed technical considerations, such as methods of deter- 
mining reliability coefficients for apparatus tests and of obtaining suitable 
criteria for validating them (73, Chapter 3). The mechanics of testing 
large numbers of aviation students with psychomotor apparatus were 
explained by Gilmer and Preston (37, Chapter 3). 

Among the standard classification-battery tests for which Melton has 
provided detailed specifications and elaborate data concerning their 
reliability and validity were the SAM Complex Coordination Test (73, 
Chapter 4), the SAM Two-Hand Coordination Test and the SAM Two-Hand 
Pursuit Test (73, Chapter 5), the SAM Discrimination Reaction Time Test 
(73, Chapter 6), the SAM Rotary Pursuit Test and the SAM Rotary Pursuit 
Test With Divided Attention (73, Chapter 7), the Rudder Control Test 
(73, Chapter 8), the Santa Ana Finger Dexterity Test (73, Chapter 9), 
six tests of steadiness designed to measure the effect of emotional stress (73, 
Chapter 10), and two Pedestal Sight Manipulation tests intended to 
select men for training as B-29 gunners (73, Chapter 11). It was found 
experimentally that the psychomotor tests in combination made a significant 
contribution to the prediction of such criteria as performance in pilot 
training beyond that obtained from the use of paper-and-pencil tests alone. 
One of the 
questions left unanswered by research completed during the war was 
whether paper-and-pencil tests could be developed to the point where the 
unique contribution of apparatus tests would be too small to warrant the 
expense of developing and administering them. 

In addition to the apparatus tests actually employed in the Air-Crew 
Classification Battery, Melton has listed many others that were still in the 
experimental stage at the end of the war. He has presented as much data 
about these as can be released under security restrictions. The tests in- 
cluded six designed to measure compensatory visual-motor reactions (73, 
Chapter 12), six that measure visual-motor pursuit skills (73, Chapter 13), 
four path-tracing tests together with variations of three of these (73, 
Chapter 14), several coordination tests (73, Chapter 15), and nine visual 
discrimination-reaction tests (73, Chapter 16). Others were seven 
timing-reaction tests (73, Chapter 17), twelve manipulation and motility tests 
designed to aid in the selection of bombardiers and radar operators (73, 
Chapter 18), eight stress tests, one of which (the Falling Hammer) was 
validated against combat criteria by a detachment in England under the 
leadership of Lieutenant Colonel Paul Horst (73, Chapter 19), a large 
number of psychophysiological measures developed and studied extensively 
by M. A. Wenger (73, Chapter 19), and eight miscellaneous tests including 
measures of kinesthetic discrimination, foresight and planning, muscular 
coordination, sway compensation, and stability of orientation, as well as 
the AAF Physical Fitness Test, the SAM Control Sequence Memory Test, 
and the Minnesota Assembly Test (73, Chapter 20). 

Because of the need for placing air crews of the highest quality in lead 
planes, considerable research was undertaken to measure the abilities 
required of men in lead planes. This was summarized by Lepley (66, 
Chapter 9). 


Training 


Research on various aspects of training in the AAF was presented by 
Flanagan (25, Chapter 6). He discussed the content of training courses, 
the amount and rate of learning that took place, and the evaluation of 
training devices. The selection of instructors was also considered. Thorn- 
dike mentioned some of the problems of training experiments (101, 
Chapter 10). 

Most of the research work on training problems was undertaken by the 
AAF Aviation Psychological Research Projects at Training Command 
installations. An account of the history, organization, and research 
activities of the Psychological Research Project (Bombardier) was pre- 
sented briefly in the Psychological Bulletin (83) and in considerable 
detail in the volume edited by Kemp and Johnson (58). The latter wrote 
a brief background history of the training of student bombardiers and 
of instructors for bombardier training schools (55, Chapter 1), calling 
attention to the fact that over 47,000 bombardiers were trained in the 
AAF between the attack on Pearl Harbor in 1941 and the surrender of 
Japan in 1945. He also outlined the organization and mission of Psycho- 
logical Research Project (Bombardier) of which he was Assistant Director 
(55, Chapter 3). Kemp and Helmick reported an experimental study 
designed to show the improvement in circular error resulting from in- 
creasing the number of bombs dropped during the bombardier training 
course (58, Chapter 8). Johnson (55, Chapter 10) summarized the work 
of the Psychological Research Project (Bombardier) and together with 
Kemp offered suggestions for future research in aviation psychology (58, 
Chapter 11). 

Psychological research concerning the selection and training of flight 
engineers began at Psychological Research Unit No. 2; later it was centered 
in the Psychological Research Project (Flight Engineer) at Hondo, Texas. 
The research projects undertaken and the trends that influenced their 
choice were outlined by French, McClelland, and Dailey (29). According 
to McClelland, Canfield, and Dailey, flight-engineer training was begun in 
April 1943 when excessive losses of bombardment aircraft on long over- 
water flights demonstrated the need for an air-crew member trained to 
operate engines at optimal power settings (71, Chapter 5). 

The psychological research carried out on flexible-gunnery training was 
reported in AAF Aviation Psychology Program Research Report No. 11, 
edited by Hobbs (50). He has written a brief history of the training of 
flexible gunners (50, Chapter 1), has pointed out the role of psychologists 
in the training program (50, Chapter 4), and has made a critical evaluation 
of the contributions of psychological research to gunnery training (50, 
Chapter 15). With Schrader (50, Chapter 11), he explained how psychol- 
ogists prepared curriculums, lesson plans, manuals, etc., for the training 
courses, formulated principles of program planning, and systematically 
evaluated the training programs. A description of the typical gunner in 
the AAF was written by Pascal (79, Chapter 3); the gunner was said to 
be about twenty-three years of age, a high-school graduate, and about 
half a standard deviation above average in mental ability. His motivation 
during training was not good. A description of several training devices 
used in flexible-gunnery training was given by Vallance and Schrader 
together with evaluative information pertaining to them (103, Chapter 9). 

The establishment of the Psychological Research Project (Navigator) 
in the AAF Training Command was described by Carter (6, Chapter 4) 
and a list of the personnel attached to it was presented (84). A complete 
account of the work of the project was made available in the research 
report edited by Carter (6), who also prepared a summary of psychological 
research in navigator training with suggestions for future planning and 
research (6, Chapter 13). Michael outlined the role of the navigator in 
the AAF, the selection of men for navigator training, and research in the 
problems of navigator training (74, Chapter 1). Suggestions regarding 
the length and arrangement of the content of the course in navigation 
resulted from a study of dead-reckoning navigation that was made by 
Dudek (19, Chapter 8). With Glaser, Dudek also reported the nature and 
results of a rigorous evaluation of a special training aid used in navigation 
training in the AAF, the so-called G-trainer. Important 
methodological implications may be derived from this study. 

In AAF Aviation Psychology Program Research Report No. 8, edited 
by Miller (75), psychological research was reported on objective measures 


551 





REviEw OF EDUCATIONAL RESEARCH Vol. XVIII, No. 6 





of flying skill, printed tests of flying information, subjective measures of 
flying proficiency, job analysis, and instructor selection and evaluation 
(75, Chapter 1). Prior to this, the functions, history, and personnel of 
the Psychological Research Project (Pilot) had been listed briefly and its 
research activities discussed at some length (85). Ericksen outlined the 
organization of the AAF Training Command and briefly described its 
functions (20, Chapter 2). Two controlled experiments in the training of 
pilots were summarized by Galt (34, Chapter 13). The results indicated 
that the use of twin-engine airplanes in basic pilot training improves per- 
formance on twin-engine airplanes in advanced training and that the 
use of optical sights on shotguns used for skeet training is desirable. The 
effect of adding five weeks of training to the normal courses in pilot 
training in the AAF was studied and the procedures and results were 
reported by Miller, Galt, and Gershenson (75, Chapter 10). A summary 
of the work of Psychological Research Project (Pilot) and 
recommendations for further work in the field were provided by Miller (75, Chapter 15). 

Psychological research on radar-observer training in the AAF was pre- 
sented in a volume edited by Cook (10). The problems encountered in 
selecting and training radar observers were mentioned and the procedures 
employed to solve them were discussed. Cook compared the use of batteries 
of tests of relatively uncorrelated mental traits with the use of batteries of 
work-sample tests (10, Chapter 12). As would be expected when two 
batteries of tests measured essentially the same mental skills in two different 
combinations, both batteries turned out to provide approximately equal 
accuracy of prediction; in such a situation, the differences in intercorre- 
lations of the parts of the two batteries could have no appreciable effect 
on their accuracy of prediction of a single criterion. Hastorf (48, 
Chapter 1) defined the scope of AAF Aviation Psychology Program 
Research Report No. 12 and outlined the essential principles of radar, its 
adaptation to airborne use in combat operations, and the training program 
for radar observers in the AAF (48, Chapter 2). He also wrote (48, Chap- 
ter 3) a brief summary of research on the selection and training of radar 
observers accomplished under the auspices of the National Defense Re- 
search Committee and by Psychological Research Project (Radar). 
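
The point made above about the two kinds of batteries can be checked with a small worked example. The sketch assumes two uncorrelated skills with hypothetical validities, and assumes that each work-sample test is an exact mixture of those same two skills with no variance of its own; none of these figures come from the report. Under those assumptions the multiple correlation is identical for the two batteries, even though the intercorrelation of the tests is zero in one battery and very high in the other.

```python
import math

def multiple_r(r1, r2, r12):
    """Multiple correlation of a criterion with two tests.

    r1, r2: validities of the two tests; r12: their intercorrelation.
    """
    r_sq = (r1 * r1 + r2 * r2 - 2.0 * r1 * r2 * r12) / (1.0 - r12 * r12)
    return math.sqrt(r_sq)

# Hypothetical validities of two uncorrelated skills, A and B.
a, b = 0.30, 0.40

# Battery 1: one relatively "pure" test of each skill (intercorrelation 0).
pure = multiple_r(a, b, 0.0)

# Battery 2: two work samples mixing the same skills in different proportions:
#   X = (A + B) / sqrt(2),   Y = (A + 2B) / sqrt(5)
x_validity = (a + b) / math.sqrt(2)
y_validity = (a + 2 * b) / math.sqrt(5)
xy_corr = 3 / math.sqrt(10)  # (1*1 + 1*2) / (sqrt(2) * sqrt(5))
mixed = multiple_r(x_validity, y_validity, xy_corr)

print(round(pure, 6), round(mixed, 6))
```

Real tests carry unique and error variance, so in practice the agreement would be approximate, not exact, as the text says.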

Studies of the acquisition and retention of air-crew skills were reviewed 
and some data pertaining to the retention of these skills during periods 
of inactivity were presented by Crawford, Sollenberger, Ward, Brown, 
and Ghiselli (12, Chapter 12). The instructional technics peculiar to the 
use of motion pictures, together with data regarding their effectiveness as 
teaching devices, were discussed by Gibson, Borin, Orvis, and Gagne (36, 
Chapter 10). 


Measurement of Proficiency and Criterion Studies 


Studies of the proficiency of bombardiers, flight engineers, flexible 
gunners, navigators, pilots, and radar observers were summarized by 
Flanagan (25, Chapter 5). Crawford, Sollenberger, Ward, Brown, and 
Ghiselli edited Report No. 16 in the series of AAF Aviation Psychology 
Program Research Reports, which included data regarding the analysis 
of duties, the criteria of proficiency used for validation purposes, and the 
validity data for a number of air-crew positions (12). Their introduction to 
the report (12, Chapter 1) indicated its scope and purpose. That the most 
fundamental and, in many respects, the most difficult problem faced in the 
AAF Aviation Psychology Program was the definition and measurement 
of satisfactory criterion variables was pointed out by Thorndike (101, 
Chapter 4). Ultimate criteria were formulated but were rarely measurable. 
Intermediate or even immediate criteria were therefore used and supple- 
mented with professional judgment. This is an excellent discussion of an 
important methodological issue. Efforts were made to maximize the rele- 
vance of available criteria and to minimize bias in them; of secondary 
consequence were efforts to maximize the reliability of criterion variables. 

Kemp discussed the development of phase checks to serve as criteria for 
validating tests used in the selection of bombardiers (58, Chapter 4). Pro- 
ficiency tests were constructed to provide measures of the practical knowl- 
edge about bombing and navigation required of bombardier students. 
These tests were described by Johnson (55, Chapter 5) and sample items 
were presented. An evaluation of various measures of proficiency for use 
in bombardier training was reported by Crawford, Sollenberger, Ward, 
Brown, and Ghiselli (12, Chapter 7). Johnson described (55, Chapter 6) 
surveys of the level of proficiency of aerial instructors and supervisory 
personnel at cadet bombardier schools and in the AAF Central Instructors 
School (Bombardier). Johnson (55, Chapter 9) also reported research on 
the development of a motion-picture test for target and check-point identi- 
fication, a study of the reliability of the circular error and of the percent 
of hits for C-1 autopilot bombing, and the results of several minor studies. 

Research work designed to improve existing criteria for judging the
performance of flight engineers and directed at the development of new
criteria was described by Seaman, Unger, Dailey, and McClelland (94).
The fact that Navigator Stanine scores have some promise for predicting 
performance in ground-school courses in operational training was indi- 
cated by Crawford, Sollenberger, Ward, Brown, and Ghiselli (12, Chap- 
ter 8) in studies of criteria for judging the proficiency of flight engineers. 

Stolurow stated that the measurement of proficiency among students at 
gunnery schools was at first ineffective (98, Chapter 7). Gradually, the 
situation was improved as well-constructed examinations became available 
and were uniformly administered and interpreted. Data concerning four 
forms of the Final Comprehensive Examination were presented. To meet 
the need for practical tests of proficiency in operating, caring for, and 
checking equipment, phase checks were developed, as described by Valen- 
tine (102, Chapter 8). A study made by Johnson and Milton (56, Chapter 
18) showed that a marked increase in accuracy of aiming a B-29 Pedestal 
Sight could be secured by redesigning the controls in the light of human 
capabilities and limitations. 
As part of the task of establishing procedures for selecting lead crews,
research reported by Crawford, Sollenberger, Ward, Brown, and Ghiselli
(12, Chapter 10) was undertaken to provide analyses of proficiency 
measures and synthetic-trainer scores for flexible gunners in operational 
training. These editors also reported research on evaluating the proficiency 
of air-crew members to provide criteria for selecting lead crews (12, 
Chapter 11). 

Research related to the development of aerial measures of navigation 
skill (97, Chapter 6) and objective ground measures of navigation skill
(97, Chapter 5) was described by Smith. Data resulting from studies of 
the graduation-elimination criterion and of the grades given in navigation 
schools were reported by Michael and Rosemark (74, Chapter 7). Analysis 
by Dudek, Peltier, Smith, Lyon, and King of the procedures used to 
determine position by means of dead-reckoning navigation indicated the 
relative importance of each of these procedures and provided leads for 
improving the teaching of dead-reckoning navigation and for decreasing 
“distance-off” (19, Chapter 9). The duties of the navigator in operational 
training and criteria for judging his proficiency were presented by Craw- 
ford, Sollenberger, Ward, Brown, and Ghiselli (12, Chapter 6). 

Problems involved in measuring pilot proficiency were discussed by
Miller (75, Chapter 4) while Ben-Avi described the grades assigned to 
students during their flying training together with analyses and evalua- 
tions of them (2). Objective measures of flying skill that were developed 
for use in primary pilot training were presented by Youtz (113, Chapter 6);
objective measures of single-engine instrument-flying skill were discussed 
and evaluated by Hagin (47, Chapter 9) while Ericksen described the 
nature and use of objective measures of multi-engine instrument-flying
skill (20, Chapter 8). Four studies concerning the measurement of pilot 
skill in flying two-engine airplanes were also reported by Ericksen (20,
Chapter 7). Fixed-gunnery scores as objective measures of flying skill
were evaluated in research studies summarized by Gleason (38, Chap- 
ter 11). The development and the use of printed tests of flying information 
were considered by Robbins and Levine (91). A series of studies on proficiency
measures and their validation were reported by Crawford, Sollenberger, Ward,
Brown, and Ghiselli concerning the fighter pilot (12,
Chapter 2), the photo-reconnaissance pilot (12, Chapter 3), the co-pilot 
(12, Chapter 5), and the airplane commander (12, Chapter 4). Investiga- 
tions of fighter-pilot proficiency, the prediction of fighter-pilot combat 
proficiency, and fatigue factors in long-range fighter missions were sum- 
marized by Lepley (66, Chapter 10) from reports written by the investi- 
gator, Lieutenant Wilse B. Webb, an aviation psychologist attached to 
the 413th Fighter Group. Fitts reported the accuracy with which AAF pilots 
can reach objects placed around them when they are unable to see either 
the objects or their own bodies (23, Chapter 15). Accuracy is greatest 
reaching forward and below shoulder level. 


Graff, Kelley, and Hastorf discussed the development and content of five 
printed tests for measuring the proficiency of students in radar training
(39). The intercorrelations of several of these proficiency tests were 
presented by Kriedt, Johnston, and Kunsman (60), who pointed out the 
considerable amount of overlap indicated by the data. Six standardized 
performance tests, developed to supplement the measurement of radar-observer
proficiency by means of paper-and-pencil tests, were described
by Bray (4, Chapter 6). Sources of unreliability in the performance-test 
scores were discussed (4, Chapter 7) and two concepts of validity were 
mentioned. The use, reliability, and relationships of the circular error in 
radar bombing with other measures of proficiency were reported by Klein 
(59, Chapter 9). It was concluded that thirty to thirty-five hours of training 
are insufficient to develop a high degree of skill in radar bombing. No 
satisfactory criteria for validating measures of proficiency for radar 
observers in operational training were found, according to Crawford, 
Sollenberger, Ward, Brown, and Ghiselli (12, Chapter 9). 

Gibson reported the construction and use of motion-picture tests for 
measuring proficiency in aircraft recognition and target identification 
(36, Chapter 6). In collaboration with Gagne, he presented experimental 
data on several aspects of aircraft recognition (33). Evaluations of the 
Renshaw system and of some alternative training procedures were made. 
Davis described the construction of several specialized examinations used 
by the AAF, including the Aviation Cadet Educational Examination, the 
Flight-Officer Examination, the AAF English Expression Test, and the 
Victory Corps Aeronautics Aptitude Test (14, Chapter 11). Experiments 
reported by Melton (73, Chapter 24) revealed significant impairment of 
proficiency on both paper-and-pencil and apparatus tests at 15,000 to 
18,000 feet without oxygen and at 45,000 feet with oxygen. Performance 
on an addition test was found especially sensitive to changes in altitude. 

In spite of innumerable difficulties, more than 1872 different indexes 
of the combat validity of the selection and classification tests used in the 
AAF were obtained. In a research report edited by Lepley (66), these data 
were reported and discussed. Two faults of criterion data were said by 
Lepley (66, Chapter 3) to be low reliability and bias. The criteria used 
included objective measures, administrative actions, direct and systematic 
observations, and ratings based on general impressions. Lepley described 
the use of proficiency tests for assembling lead crews and for detecting the 
need for precombat or refresher training (66, Chapter 8). The tests found 
to be most predictive for bombardier criteria of combat effectiveness were 
Spatial Orientation I and II, Mathematics A and B, Mechanical Principles, 
Discrimination Reaction Time, and the Pilot Stanine (66, Chapter 4). Of 
thirty-seven variables correlated with several measures of navigator effec- 
tiveness in combat, Lepley reported (66, Chapter 5) that sixteen had pre- 
dominantly positive correlations. The four best predictors were Technical 
Vocabulary (Navigator), Technical Vocabulary (Pilot), Arithmetic Rea- 
soning, and Mathematics B. The absolute magnitudes of the validity 
coefficients were not especially meaningful because of marked attenuation 
resulting from the rigorous selection of navigators at classification centers
on the basis of the Navigator Stanine. A total of 889 validation statistics, 
using the combat effectiveness of pilots as the criterion, were presented by 
Lepley (66, Chapter 6). Greatest effectiveness was found for predicting 
fighter-pilot performance. The most useful classification tests were Mechan- 
ical Principles, SAM Two-Hand Coordination, SAM Rotary Pursuit, Spatial 
Orientation I and II, Aiming Stress (portrayed on the stage in Winged 
Victory), and Table Reading. Criteria of success in combat different from 
those employed in the studies summarized by Lepley were utilized by 
Mollenkopf. He found no evidence of significant relationships between the 
criteria he used and various selection tests (77, Chapter 4). Lepley summarized
the psychological research work done in various combat areas by sixteen
officers and three enlisted men on temporary duty (66, Chapter 11).


Studies of Requirements 


Studies of the requirements of air-crew positions were made at various 
times in the AAF Aviation Psychology Program and for many different 
purposes. Job requirements for the bombardier, navigator, and pilot were 
reported by Walton (106). Thorndike summarized (101, Chapter 2) the 
job-analysis procedures employed as a basis for test construction: (a) 
review of existing literature, (b) analysis of records of performance, 
(c) interviews with air-crew personnel, (d) direct experience on the part
of psychologists, and (e) correlation of tests and criteria. 

A description of tasks performed by students in flight-engineer training 
schools was presented by Schmonsees, Unger, Riecken, and McClelland 
(92), who also made a job analysis in terms of psychological traits. 
According to Valentine (102, Chapter 2), the task of the flexible gunner 
was ordinarily that of firing at a target (an attacking fighter plane) from 
a platform (a bomber) also moving in three dimensions. A discussion of 
the skills and abilities involved in gunnery was prepared by Irion (53, 
Chapter 5), who also considered the use of synthetic trainers as criterion 
measures. 

A job analysis of the navigator’s task and the attributes of a successful 
navigator were presented by Whiteside and Glaser (109). Youtz has pro- 
vided a convenient summary of the skills and abilities required of a pilot 
(113, Chapter 3, Part I) and Ericksen made an analysis of the pilot’s task 
in specialized types of activities, such as instrument flying, night flying,
navigation, and formation flying (20, Chapter 3, Section II). 

Kelley reported a job analysis for radar observers (57, Chapter 4) made 
largely in terms of mental abilities defined by centroid factors to which 
names were ascribed on the basis of subjective judgment. Investigations of 
the combat requirements for air-crew personnel were summarized by 
Lepley (66, Chapter 7), and Flanagan discussed research on mission 
failures and on errors of personnel during operations in combat (25,
Chapter 7).


Attitudes, Morale, and Leadership


Research on attitudes, morale, and leadership was conducted by the 
AAF Aviation Psychology Program largely in the AAF Personnel Distribu- 
tion Command in redistribution stations or convalescent hospitals. Many 
of the studies made in these installations were summarized by Flanagan 
(25, Chapter 8), among them investigations of fear and courage in aerial 
combat, anxiety reactions, counseling and therapy, and attitudes and 
preferences of combat returnees. 

Psychological research on problems of redistribution was summarized 
in a report edited by Wickert (110). In this report, Wickert recounted 
the history of psychological research in AAF redistribution stations and 
listed the personnel engaged in it (110, Chapter 1); he also made an 
over-all evaluation of the work and mentioned the potential value of data 
that were gathered but not fully analyzed during the war (110, Chapter 8). 
Crannell and Mollenkopf outlined the extensive research conducted to 
determine the essentials of combat leadership (11). Methodological prob- 
lems of research in leadership were stressed and the instruments used 
were described. Studies conducted to ascertain the nature of anxiety re- 
actions in combat were reported by Shaffer (95, Chapter 5). The tests 
used to select air-crew personnel were found to be unrelated to the pres- 
ence or absence of anxiety reaction as determined by psychiatric examina- 
tion. A Personality Inventory was developed, however, which consistently 
showed biserial correlations of the order of .50 with the criterion. With 
Kamman, Lecznar, Pearson, and Williams, Shaffer discussed surveys of 
fear and courage in aerial combat, of the psychological causes of mission 
failures, and of disorientation in instrument flying (95, Chapter 6). The 
attitudes and preferences of AAF air-crew personnel returned to the con- 
tinental United States from combat zones were described by Shaffer and 
Pearson (95, Chapter 7). Differences among fighter pilots, bomber pilots, 
bombardiers, and navigators were pointed out. 

The attitudes and opinions of flexible gunners who had recently returned 
from combat, had graduated from a training school, or had not entered 
into combat were reported by Irion (53, Chapter 10). Some of the prob- 
lems encountered in training navigators who had been returned from 
combat and assigned to the AAF Instructors School for Navigators were 
outlined by Friedman, Rosemark, Heathers, Grigg, and Zielonka (30). 
A study of the attitudes of air-crew personnel (both officer and enlisted) 
returned from combat toward further duty assignments of various types 
was described by Crawford, Sollenberger, Ward, Brown, and Ghiselli 
(12, Chapter 13). 

Bijou (3) edited a volume of the AAF Aviation Psychology Program 
Research Reports concerning research in AAF convalescent hospitals. The 
need for and development of the psychological services and research work 
in these hospitals was stated by Bijou and Gillman (3, Chapter 1). The 
psychological services were described in detail by McNeill, Heathers, Rotter, 
Willerman, and Lawrence (72), including evaluation procedures and
individual and group counseling technics. Research activities were outlined 
by Bijou and Heathers (3). The tests and inventories that were used 
were mentioned and the criteria used to validate them were listed. These 
same authors also prepared a summary and evaluation of all the service 
activities and research work of psychologists in the AAF convalescent 
hospitals (3, Chapter 11). 

Data derived from the administration of five personality inventories 
were summarized by Heathers (49), who concluded that all five of them 
possessed substantial utility. Lawrence and Levine investigated attitudes 
of patients in AAF convalescent hospitals (65, Chapter 5) and Lawrence 
reported data suggesting that biographical information may be useful in 
making prognoses for convalescent patients (65, Chapter 6). Descriptions 
of a number of interest questionnaires used in AAF convalescent hospitals 
and results obtained from their use were presented by Lucio and Mc- 
Reynolds (69). 

In an effort to measure the impairment of mental efficiency associated 
with psychiatric disorders, the Shipley-Hartford Retreat Scale for measur- 
ing mental impairment and a new Efficiency of Mental Application Test 
were tried out in the convalescent hospitals. Bijou and Lucio discussed 
the findings, noting that both tests showed promise, particularly the test 
assembled especially for use in the AAF (3, Chapter 8). Three projective 
tests were used in the convalescent hospitals by psychologists in the AAF 
Aviation Psychology Program: the Rorschach Test, the Bender Visual- 
Motor Gestalt Test, and the Incomplete Sentences Test. Of the three, the 
last seemed to differentiate best between normal and maladjusted patients. 
These data were reported by Wischner, Rotter, and Gillman (112). An 
ingenious method of quantifying interpersonal behavior in group counsel- 
ing was described by Willerman and Pascal (111) and an illustration of 
its use was given. 

In an article published in the Psychological Bulletin (79), Super dis- 
cussed case studies and clinical evaluations of aviation cadets together 
with the projective technics employed. Tho most of the work of the AAF 
Aviation Psychology Program was concerned with mass testing by means 
of objective measures, elaborate studies of clinical procedures were made 
on samples of aviation students in order to assess their efficacy. In gen- 
eral, the data showed that clinical evaluations did not add anything to 
predictions of performance made solely on the basis of machine-scorable 
objective tests. 


Tabulating and Analysis Technics 


Elaborate safeguards were employed in test-scoring operations of the 
AAF Aviation Psychology Program to prevent and catch errors. The 
procedures used in scoring classification tests were described by Gilmer 
and Preston (37, Chapter 4). These authors likewise discussed the routine 
checks and statistical technics employed to insure comparability in the
classification-test scores derived from apparatus tests (37, Chapter 5). 

Because validation of test scores obtained at classification centers and 
psychological examining units against performance in training courses 
and in certain air-crew duties was the foundation stone of research carried 
on in the AAF Aviation Psychology Program, it was essential to have an 
accurate, complete, and convenient records system. Gilmer and Preston 
described the routine of handling records in psychological examining units 
(37, Chapter 6) while Simon and Berwick described the records system 
at the Headquarters of the AAF Training Command (96, Chapter 10), 
where test scores from many examining units were filed together. The basic 
records files (96, Chapter 11) and the training-data files (96, Chapter 12) 
maintained in the Psychological Section at Headquarters, AAF Training 
Command were also discussed by Simon and Berwick (96, Chapter 13). 
These authors mentioned everyday problems encountered in the collection 
and maintenance of machine records and made suggestions for avoiding 
them (96, Chapter 17). They discussed the types of errors common in 
machine-records operations and methods used to control them (96, Chap- 
ter 18). General considerations in the establishment and use of machine- 
records systems, with illustrations from their experience in the AAF 
Aviation Psychology Program, were presented by Simon and Berwick 
(96, Chapter 9). They also discussed the dissemination of data by means 
of rosters, punched cards, and microfilms (96, Chapter 14).

In AAF Aviation Psychology Program Research Report No. 3 edited 
by Thorndike (101), some of the technical problems encountered in 
psychological research work during the war were considered and the procedures
developed to meet them were summarized. Special attention was
given to problems associated with the selection and classification of personnel.
To express validity coefficients, the product-moment r was used
whenever possible. With dichotomized criteria, biserial rather than point- 
biserial r’s were computed in order to minimize the effect of variation 
in the position of the dichotomic line on validity coefficients obtained in 
different samples. Thorndike discussed these and other correlation statistics 
used in determining the validity of single tests (101, Chapter 5) and pre- 
sented the formulas used to correct for restriction of range due to prior 
selection. Procedures for obtaining composite aptitude scores were outlined 
by Thorndike (101, Chapter 6). The multiple-regression and multiple- 
cutoff methods were contrasted and the reasons for choosing the former 
for use in the AAF Aviation Psychology Program were mentioned. A for- 
mulation of the problem of a unique classification system was presented. 
Emphasis was given (101, Chapter 8) to the significance of the intercor- 
relations of a set of variables proposed for use for selection purposes. 
Three types of prediction problems were identified: selection, multiple 
selection, and classification. The importance of test reliability as an aid 
to interpreting test-validity data was stressed by Thorndike (101, Chapter 7)
and various ways of computing reliability coefficients were
mentioned. An analysis of the sources of variance in test scores was especially
noteworthy (p. 102-103). Several formulas developed by A. P. Horst to 
determine the loss of test validity ascribable to extraneous variance in 
test scores were presented by Thorndike (101, Chapter 9). Methods used 
to minimize extraneous variance in test scores obtained in the AAF 
Aviation Psychology Program, especially in apparatus-test scores, were 
found to be highly effective. 
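
The distinction between the point-biserial and biserial coefficients, and the effect of correcting for restriction of range, can be illustrated with a brief sketch. It is offered only as an illustration under stated assumptions: the function names are hypothetical, the biserial computation assumes a normally distributed variable underlying the dichotomized criterion, and the range correction shown is the familiar univariate formula for explicit selection on the predictor, which may not be the exact form given in the report.

```python
import math
from statistics import NormalDist, mean, pstdev


def point_biserial(scores, passed):
    """Point-biserial r between a continuous score and a 0/1 criterion."""
    hi = [s for s, p in zip(scores, passed) if p]
    lo = [s for s, p in zip(scores, passed) if not p]
    p = len(hi) / len(scores)
    q = 1.0 - p
    # Population SD of all scores makes this identical to the product-moment r.
    return (mean(hi) - mean(lo)) / pstdev(scores) * math.sqrt(p * q)


def biserial(scores, passed):
    """Biserial r, assuming a normal variable underlies the dichotomy."""
    p = sum(1 for x in passed if x) / len(passed)
    q = 1.0 - p
    # Ordinate of the unit normal curve at the point dividing p from q.
    y = NormalDist().pdf(NormalDist().inv_cdf(p))
    return point_biserial(scores, passed) * math.sqrt(p * q) / y


def correct_for_range_restriction(r, sd_unrestricted, sd_restricted):
    """Estimate r in the unselected group from r in the selected group.

    Assumes selection was made directly on the predictor (an assumption
    of this sketch, not a statement about the report's formulas).
    """
    k = sd_unrestricted / sd_restricted
    return r * k / math.sqrt(1.0 - r * r + r * r * k * k)
```

When the criterion is split evenly, the biserial coefficient is about 1.25 times the point-biserial. The point-biserial shrinks as the split becomes more uneven, while the biserial is intended to be comparatively stable, which is the property cited above for preferring it across samples.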

Most research workers will want to become familiar with parts of AAF 
Aviation Psychology Program Research Report No. 18, edited by Deemer 
(17). In this report Alchian has written four chapters on the methods of 
statistical analysis employed in the AAF Aviation Psychology Program 
that are notable for their presentation of up-to-date concepts in surprisingly 
compact and straightforward fashion. The basic principles of modern 
statistical analysis and inference were stated succinctly (1, Chapter 20) and 
were followed by detailed descriptions of the procedures used to estimate 
the parameters of univariate distributions (1, Chapter 21). The statistics 
employed in bivariate analyses were set forth with the tests of significance 
appropriate for use with them (1, Chapter 22) and technics of multi- 
variate analysis were described with special reference to regression sta- 
tistics (1, Chapter 23). 

This research report (No. 18) also includes two interesting chapters 
written by Simon and Berwick on machine technics. In one of these, 
detailed procedures for obtaining biserial correlation coefficients and 
intercorrelations were presented (96, Chapter 15), and in the other a 
method for obtaining the sums of squares and of products with the IBM 
alphabetical accounting machine was described (94). 
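
The value of accumulating sums of squares and of products is that a product-moment correlation can then be obtained from six running totals without returning to the individual records. The sketch below shows the generic raw-score formula only; it does not reproduce the machine procedure described in the report, and the function name is hypothetical.

```python
import math


def correlation_from_sums(pairs):
    """Product-moment r computed from running sums over (x, y) pairs."""
    n = 0
    sum_x = sum_y = sum_xx = sum_yy = sum_xy = 0.0
    for x, y in pairs:
        # One pass over the records, as with a deck of punched cards.
        n += 1
        sum_x += x
        sum_y += y
        sum_xx += x * x
        sum_yy += y * y
        sum_xy += x * y
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2)
    )
    return numerator / denominator
```

Because only the totals are retained, the same pass can serve any number of variable pairs, which is presumably what made the approach attractive for producing tables of intercorrelations.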

Statistical procedures commonly used in one or two psychological 
research units for computing reliability coefficients, item-analysis data, 
validity data, and factorial data regarding items and tests were outlined 
by Humphreys (52, Chapter 3). The type of internal-consistency and 
external-criterion item-analysis data used in the development of the 
Aviation Cadet Qualifying Examination and many other examinations was 
explained by Davis (14, Appendix A). Detailed instructions for computing 
the data as well as evidence of its reliability were provided. Item difficulty 
indexes were found to be more reliably determined than item-test correlation 
coefficients. Guilford discussed the factorial composition of a large number 
of the tests developed for use in classifying aviation students and related 
these data to the criteria that were to be predicted (43, Chapter 28). 
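
The two kinds of item statistic contrasted above, a difficulty index and an item-test correlation, can be sketched as follows. This is a generic illustration rather than the procedure given by Davis: difficulty is taken simply as the proportion answering correctly, the item-test correlation is an uncorrected point-biserial against the total score, and the function name and data layout are assumptions.

```python
from statistics import mean, pstdev


def item_analysis(responses):
    """Return (difficulty, item-test r) for each item.

    responses: list of examinee rows, each a list of 0/1 item scores.
    """
    totals = [sum(row) for row in responses]
    sd_total = pstdev(totals)
    results = []
    for j in range(len(responses[0])):
        item = [row[j] for row in responses]
        p = mean(item)  # proportion of examinees answering correctly
        if 0 < p < 1 and sd_total > 0:
            right = mean([t for t, x in zip(totals, item) if x])
            wrong = mean([t for t, x in zip(totals, item) if not x])
            # Point-biserial of the item with the total score; the total
            # still includes the item itself (no correction for overlap).
            r = (right - wrong) / sd_total * (p * (1 - p)) ** 0.5
        else:
            r = None  # undefined when everyone or no one passes the item
        results.append((p, r))
    return results
```

A proportion rests on every examinee's response to the one item, whereas the correlation also depends on the spread of total scores, which is one plausible reason difficulty indexes are the more stable of the two.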


Design of Equipment 


The establishment of a Psychology Branch in the Aeromedical Labora- 
tory at Wright Field, as reported by Fitts (24), provided a central point 
for psychological research on the design of equipment. Previously, con- 
siderable work in this area had been accomplished in the AAF Aviation 
Psychology Program, but no organization had been specifically charged 
with the responsibility. A report edited by Fitts (23) presented the research
data accumulated on the design of equipment with regard to human capa- 
bilities and limitations. The nature of engineering psychology and its 
applications, methods, and technics were discussed by Fitts (23, Chapter 1). 
Problems associated with the means of presenting information obtained 
in the form of instrument readings were described by Grether (40, Chap- 
ter 2). Brown and Jenkins prepared an outline of research related to the 
design of equipment, which was based on an analysis of human motor 
abilities (5). A bibliography was appended. 

A number of studies have been made to determine how aircraft instru- 
ments and accessory materials should be designed to minimize errors in 
using them. Comparing the relative ease and accuracy with which tables 
and graphs were read, Carter concluded that tables are preferable as a 
means of presenting data if interpolation is not required. If it is, graphs 
are to be preferred (7, Chapter 4). The sources of error in reading air 
navigation plotters were identified and, according to Christensen (9), a 
new plotter has been designed that should prove considerably easier to 
use. Grether has shown that a twenty-four-hour dial face on a clock is 
easier to read than a twelve-hour dial face provided that time is to be 
read according to the twenty-four-hour system (40, Chapter 6). Optimum 
characteristics of a twenty-four-hour clock face were determined. Some 
findings with respect to dial faces are interesting; Grether and Williams 
discovered (40, Chapter 7) that the accuracy with which dials were read 
increased as their diameters were increased up to two inches. It also 
increased as gradations were increased to seven-tenths of an inch. On the 
other hand, speed of dial reading did not appear to be related to size of 
dial diameter or scale interval. A study of the interpretability of various 
types of aircraft attitude indicators, made by Loucks (68), showed that 
for blind flying the horizon should remain fixed and a three-dimensional 
miniature aircraft should constitute the moving element and should move 
in the direction in which the plane rolls. 

Another group of related studies pertained to airplane control knobs 
and their uses. Weitz (108) concluded that coding control knobs by color 
and shape helped reduce the difficulty normally experienced when a pilot 
shifts to an unfamiliar airplane in which the controls are placed differently. 
Experimentation with control knobs of various shapes has indicated, ac- 
cording to Jenkins (54, Chapter 14), that knobs of certain shapes are less 
frequently confused than others and should be standardized for use in 
aircraft cockpits. Data obtained by Grether (40, Chapter 17) showed that 
airplane controls can be handled more efficiently with the arms and hands 
than with the legs and feet. Fore-and-aft movements were found more 
efficient than lateral movements. If a group of controls must be adjusted 
rapidly in a certain sequence, Murray recommended (78) that they be 
operated in a similar direction. Clockwise movement of a rotary control 
of an indicator should, Carter and Murray found (7, Chapter 10), be 
associated with downward and left-to-right movement of the indicator. 
In a different experiment, Warrick discovered that clockwise rotation of a
control knob should be associated with movement of an indicator toward 
the operator and from right to left (107, Chapter 9). 

Mild anoxia (the condition resulting from lack of sufficient oxygen) 
seemed not to affect the number of illusions under experimental condi- 
tions reported by Grether, Cowles, and Jones (40, Chapter 19). The errors 
made by a pilot in reading instrument dials tend to increase in the
presence of moderate G force, as indicated by Warrick, Nelson, and Lund 
(107, Chapter 20). 

On the basis of an investigation of ability to reproduce pressures, Jenkins 
concluded (54, Chapter 12) that a wide range of pressures from five pounds 
up to thirty or forty pounds should be required in the operation of airplane
controls. Pressures greater or less than those limits seem to be more
difficult to reproduce accurately. According to Van Saun (104), for radar 
operators the polar-grid sector scope was superior to the cartesian-grid 
sector scope. Both scopes were more readily interpreted when the PPI
scope and the sector scopes had the same orientation. 

Important contributions were made by psychologists to the design of 
equipment for flexible gunners and to the technics used in sighting and
aiming. Some of these have been mentioned previously in this chapter;
others were discussed by Vallance (103, Chapter 14). 


Motion-Picture Testing and Research 


A complete report of the work on motion-picture testing and research 
in the AAF Aviation Psychology Program was provided in the volume 
edited by Gibson (36). Some special aspects of the work have been men- 
tioned previously in this chapter in connection with topics to which they are 
relevant; other aspects of the work will be summarized in this section. 

The history, functions, and personnel of the Psychological Test Film 
Unit were first published in the Psychological Bulletin together with the 
hypotheses to be tested, research work under way, and test-construction 
technics employed (90). Gibson wrote more fully on these topics (36, 
Chapter 1). The peculiar characteristics of motion-picture tests and the 
unique possibilities of their application to psychological testing were dis- 
cussed by Gagne, Bornemeier, Gibson, and Borin (32). Many practical 
problems in constructing and producing motion-picture tests were reported 
by Gibson, Bornemeier, Eisenberg, and Slater (36, Chapter 3). Some of 
the problems confronted in the presentation of motion-picture tests were 
mentioned by Finney and Gibson (21). Experimental evidence of the 
effects of varied amounts of illumination and of seating position on the 
perception of motion pictures was obtained. More theoretical were Gibson's 
discussion (36, Chapter 8) of the differences between the perception of 
pictures and the perception of visual realities and Gibson and Glaser’s 
formulation (36, Chapter 9) of a systematic theory to account for observed 
data regarding individual ability in monocular space perception. Further 
research in the perception of space is needed, according to the authors.


Implications of Psychological Research in the
AAF Aviation Psychology Program 


It has been impossible to include in this chapter all of the articles con- 
cerning the applications to psychological and educational research of work 
done in the AAF Aviation Psychology Program, but a great many refer- 
ences have been reviewed. 

Flanagan discussed general contributions of the AAF Aviation Psy- 
chology Program to the theory and knowledge of individual differences 
and trait differences (25, Chapter 9). Considerations pertaining to the 
trait theory of human abilities, the measurement of traits, the significance 
of motivation, and the nature and significance of personality factors were 
taken up. Of special interest to educators was Flanagan’s statement of the 
implications of research in aviation psychology regarding the nature and 
principles of learning, the relative importance of aptitude and training, 
and the measurement of success (25, Chapter 10). The procedures utilized 
in the AAF Aviation Psychology Program for the measurement of achieve- 
ment and the prediction of human behavior will be of interest to research 
workers in psychology and education. Flanagan has discussed these along 
with the statistical technics and experimental methods employed (25, 
Chapter 12). He has also commented on several types of research studies 
leading to the design of equipment for maximum efficiency (25, Chapter 
11). Altho to many research workers, much of the experimental work on 
the design of equipment may appear to be elaborate (and, therefore, ex- 
pensive) demonstrations of the obvious, Flanagan believes that much work 
will be carried on in this field in the future. It would appear that this is 
likely, since its application in industrial psychology is clear. 

Guilford has published considerable material concerning the general con- 
clusions and implications drawn from testing and classifying aviation 
cadets in the AAF (43, Chapter 29). The discovery of aptitude and 
achievement variables was reported (42) and Guilford and Zimmerman 
(46) listed twenty-seven factors found by centroid analyses of a number 
of different correlation matrices based on scores from tests administered to 
highly selected men in aviation-cadet training. The factors have been iden- 
tified subjectively by the authors and their co-workers in the AAF Aviation 
Psychology Program. In the case of ten of these factors the authors believe 
the names chosen for them may be reasonably accurate descriptions. The 
practical value of well-established psychological principles was demon- 
strated during the war, in the opinion of Guilford (43), who drew some 
lessons from aviation psychology. In another publication (44), Guilford 
mentioned findings that confirmed long-established principles of test theory; 
namely, that test validity coefficients are more important than test reliability 
coefficients for evaluating tests and that the value of a test for multiple 
selection should be judged in terms of its unique contribution to accuracy 
of prediction rather than in terms of its validity coefficient. There will be 
general agreement on these points, but whether factorial analysis pro- 
vides the means of reaching the desired objectives may be a matter for 
further discussion among test technicians. 

Davis prepared for a commission of the American Council on Educa- 
tion a brief description of the selection and classification procedures used 
in the armed forces together with their implications for civilian education 
(15). He indicated that the technics for selecting and classifying aviation 
cadets in the AAF Aviation Psychology Program constituted the first 
practical demonstration of the principles that are likely to form the basis 
for soundly conceived instruments useful in educational and vocational 
guidance in the future. For differential selection and classification of 
personnel it appears likely that tests will be developed to measure the 
variance common to the several criteria to be predicted and to measure 
separately the variance that is unique to each one of the criteria. The 
relative weighting of the tests measuring common and unique variance 
will depend on the proportion of the available manpower that can be 
rejected entirely. To secure measures of unique variance in each criterion 
it is not enough to make use of tests that are merely independent; in prac- 
tice, such tests will probably be constructed by correlating individual 
test items (that measure as nearly as possible only one mental function 
and that are maximally reliable) with each one of the criteria to be 
predicted and building up groups of items that have correlations as high 
as possible with one criterion and as low as possible with all other 
criteria. This is the logical extension to test construction for purposes of 
differential classification of the principles employed to construct the 
Aviation Cadet Qualifying Examination for purposes of selection alone. 
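
The item-selection principle described above (items correlating as highly as possible with one criterion and as little as possible with all others) may be sketched as follows. This is a hypothetical illustration only, not the procedure actually used in the AAF program: the function name `select_differential_items`, the thresholds `min_target` and `max_other`, the criterion labels, and the correlation values are all assumptions introduced here.

```python
def select_differential_items(item_criterion_r, target, min_target=0.30, max_other=0.10):
    """Return items that correlate highly with `target` and weakly with every other criterion.

    item_criterion_r maps item name -> {criterion name -> item-criterion correlation}.
    The two thresholds are arbitrary illustrative cut-offs, not values from the source.
    """
    selected = []
    for item, correlations in item_criterion_r.items():
        # Absolute correlations with all criteria other than the target.
        others = [abs(r) for name, r in correlations.items() if name != target]
        if correlations[target] >= min_target and all(r <= max_other for r in others):
            selected.append(item)
    return selected


# Invented item-criterion correlations for three hypothetical items.
toy_r = {
    "item_1": {"pilot": 0.40, "navigator": 0.05, "bombardier": 0.08},
    "item_2": {"pilot": 0.35, "navigator": 0.30, "bombardier": 0.25},
    "item_3": {"pilot": 0.02, "navigator": 0.45, "bombardier": 0.06},
}

pilot_items = select_differential_items(toy_r, "pilot")
navigator_items = select_differential_items(toy_r, "navigator")
```

In this toy case `item_2` is excluded from both groups: it predicts every criterion about equally, so it measures common rather than unique variance.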
CHAPTER IV 


The Personnel Research Program of the Adjutant 
General’s Office of the United States Army 


E. DONALD SISSON 


THE CONTRIBUTIONS of the Personnel Research Section of the Adjutant 
General’s Office in World War II are reviewed in this chapter. This pro- 
gram was established in 1940 in the Adjutant General’s Office with the 
advice of the Committee on Classification of Military Personnel of the 
National Research Council. This Committee, of which W. V. Bingham was 
chairman, included C. C. Brigham, H. E. Garrett, L. L. Thurstone, L. J. 
O’Rourke, M. W. Richardson, and C. L. Shartle. 

The six main sections of this chapter present the work of the staff of 
the Adjutant General’s Office on selection (I) and classification (II) pro- 
cedures, training (III), the measurement of proficiency (IV), leadership 
(V), and tabulating and analysis technics (VI). Since the contributions 
of this group in the form of numbered pamphlets in the Personnel Research 
Section Report series are anonymous, the individuals who served on this 
staff from 1940 to 1946 are listed in the accompanying footnote.* 

I. Selection 


Induction Standards 


The differentiation of those who could learn to be proficient soldiers in 
a reasonable length of time from those of insufficient learning capacity 
for such service was a preliminary necessity in the utilization of man- 
power in the war effort. In order to distinguish enlistees who could learn 
duties of a soldier in the usual amount of time (Army General Classification 
Test Grade III) from slow learners (Army General Classification Test 
Grade IV), Classification Test R-1 was developed (35) from AGCT-1a 
items with item-grade correlations of .35 to .65. It was standardized (61) 
in June 1941. Critical scores equivalent to AGCT standard scores of ninety 
and one hundred were derived. Another form, R-2, was prepared from 
AGCT-1b in February 1942 (106), and similar critical scores established 
(119). Forms R-3 and R-4 were ready in May 1946 for use with men 
enlisting or reenlisting in the Army, and the relationship of these forms 
to AGCT-3a was studied (310). Placement and achievement tests in reading 
and arithmetic were constructed (254) for each of the four levels of 
training given in Special Training Units (STU) which were set up to 
teach illiterates possessing learning ability. Preliminary research was 
extensive; experimental forms were studied for item content, item-analyzed 
for difficulty, and validated. Standard score scales were constructed for 
these tests (250, 251). 

Literacy Tests. Attempts were made to determine minimum literacy re- 
quirements for acceptance for induction, and to develop measures of 
mental capacity not dependent upon higher literacy levels. Minimum 
literacy tests were constructed early in 1941 to eliminate those unable 
to read at the fourth grade level. Critical scores were determined (56) 
using the Metropolitan Advanced Reading Test, Form A, as the reading 
ability criterion. Minimum Literacy Test (Form 1) scores of engineer 
trainees at Ft. Belvoir in August 1941 were studied in relation to ratings 
of unsatisfactory, satisfactory, and outstanding on fourteen criteria obtained 
from training records (55). A tetrachoric r of .45 between those passing 
and failing the test and the percent above and below the median training 
rating indicated some relationship between success on literacy test and 
success on “job.” A sharp increase in the percent of unsatisfactory ratings 
below fourth-grade level indicated this minimum reading ability was a 
reasonable critical level. Two forms of a verbal measure of general learning 
ability, Qualification Test Q-1 and Qualification Test Q-2, were released 
in June 1943, replacing the “pure literacy” test. Each test contained items 
on paragraph reading, arithmetic computation, and general orientation. 
Critical scores were established by which men were accepted, rejected, 
or assigned to special training. The percents of 3311 men tested in induction 
centers in three service commands scoring in the critical score intervals 
were tabulated (212, 242). The relationships of Q-1 and other tests to 
level of training in Special Training Units were studied (192) in Sep- 
tember 1943. 

Nonverbal Group Mental Ability Tests. Visual Classification Tests VC-1, 
X-1; VC-1, X-2; VC-1, X-3; and VC-1a were nonlanguage tests con- 
structed to select a quota of illiterates with sufficient mental capacity to 
absorb army training. The item types included visual perception, paired 
comparison, and abstraction. Revised forms were developed from item 
analyses of the preceding forms (137, 157, 158, 1600). VC-1, X-2 was 
standardized on a population of 764 men containing Negroes and whites 
in a ratio of approximately five to one (138). A lower critical score was 
set to exclude the lowest 2 percent of the Army GCT population, an upper 
critical score corresponding to an AGCT standard score of sixty—Grade IV. 

Individual Tests of Mental Ability. The Wechsler Self-Administering 
Test was found too difficult, with too narrow a range of scores among low- 
grade men. Study and item analysis (112) in March 1942 showed correla- 
tions of .83 with the AGCT for unrestricted range of 1250 men and .23 
for restricted range of 375 low-grade (Grade V on AGCT) men. Its 
validity as a predictor of soldier performance ratings in Special Training 
Units was very low (218). Low correlations with these ratings of other 
Army induction tests (216) suggested the possible inadequacy of ratings. 
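
The drop from .83 in the unrestricted group to .23 among Grade V men is the kind of result expected when a group is selected on one of the two correlated variables. A minimal sketch of the standard formula for direct restriction of range follows. It assumes linear regression and equal scatter throughout the range. The function name `restricted_correlation` and the standard-deviation ratios tried are assumptions made for illustration; the source does not report the standard deviations of the two groups.

```python
import math


def restricted_correlation(r_unrestricted, sd_ratio):
    """Correlation expected after direct selection on one of the two variables.

    sd_ratio is the standard deviation in the restricted group divided by the
    standard deviation in the unrestricted group (1.0 means no restriction).
    """
    numerator = r_unrestricted * sd_ratio
    denominator = math.sqrt(1 - r_unrestricted ** 2 + (r_unrestricted * sd_ratio) ** 2)
    return numerator / denominator


# Hypothetical degrees of restriction applied to the unrestricted value of .83.
expected = {ratio: restricted_correlation(0.83, ratio) for ratio in (1.0, 0.5, 0.25)}
```

The narrower the range of scores in the selected group, the lower the correlation that can be expected, even when the underlying relationship is unchanged.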

Over-All Studies. In validation studies (test scores with ratings in Special 
Training Units) of groups of tests (209, 210, 211, 213), the tests with some 
verbal component were better than the others in screening the unsatisfac- 
tory STU men. Biserials between test scores and rejection after STU train- 
ing or graduation ranged from .38 to .72 for the tests in use December 
1943 to February 1944 (214). Various combinations of tests gave multiple 
correlations well above .60. Qualification Test Q-1, dependent to some 
extent upon literacy, was the best predictor. Standardization data (215) 
were obtained on three tentative test batteries. The results of the new induc- 
tion program in June 1944 showed higher rejection rates than the old 
programs, approximately a 3.5 percent difference each month (242). 
Also, the educational inferiority of the southern selectee was evidenced 
by comparative rejection rates of Negroes and whites. 

Recruiting Standards for the Women’s Army Corps (WAC). Several 
mental alertness tests were developed for selecting women for the WAC. 
Women’s Classification Test WCT-1, X-2 (first designated Mental Alertness 
Test MAT-1, X-2), used in the selection of both enlisted women and officer 
candidates, was standardized (150) in 1942. The test had a Kuder-Rich- 
ardson reliability of .94 and correlated highly with the AGCT, Otis Group 
Intelligence, Otis Self-Administering, and the ACE tests (150). A revised 
form, WCT-2, used only to select enlistees, had a Kuder-Richardson relia- 
bility of .97 and correlated .85 with the AGCT (199). It was standardized 
(202) on 12,000 applicants in October 1943. WCT-2 was superior to the 
AGCT in predicting academic grades in WAC officer candidate school, 
but neither of the tests predicted leadership ratings (191). In 1944 a 
short recruiting test, Classification Test R-1, replaced WCT-2. R-1, which 
has also been used for regular Army recruitment (v.s.), had a reliability 
of .94 and correlated .78 with WCT-2 (197). 


Selection for Specialist Training 


A meteorology aptitude test was used by the Air Corps thru the war for 
the selection of weather observer students. This test battery, consisting of 
a mental alertness test of the traditional type and fifty meteorology and 
144 physics true-false items, gave adequate validities and reliabilities (23] 
Aircraft Warning Aptitude Test TC-10A contained a section on locating 
grid points by coordinates and a section on plotting coordinates. The first 
part proved valid against the criterion of theoretical grades in courses, the 
second part against performance grades. Aircraft Warning Classification 
Test TC-11a was given to those who passed the previously mentioned test 
for classification into potential specialist categories. More than 90 percent 
of failures were eliminated (226). Among the specialized tests used in 
small, sometimes unsuccessful, programs to select trainees for highly 
specialized Army courses, were those for Balloon Barrage courses (129), 
Combat Intelligence courses (181, 196), Military Police courses (171), 
and Medical Technicians (162). A battery including the AGCT and 
several mechanical aptitude tests was investigated for use to select Air 
Corps bombardiers and navigators. Paper-and-pencil tests were found 
to be related (73) to academic course grades but not to flight-training 
records. However, reliability of the latter criterion was low. Research in 
this area was subsequently transferred to the Air Surgeon’s Office in 
December 1941. A large-scale comparative study of apparatus and written 
tests was conducted for the purpose of validating and standardizing an 
aptitude testing program for Air Corps basic-training centers. The written 
tests were generally superior to the apparatus tests against the criterion 
of academic success in training courses (227, 228, 229). Tests finally 
chosen for the battery contained only two performance tests out of a large 
number tried: (a) Nut and Bolt Manual Dexterity Test TC-5a, and 
(b) U-Bolt Assembly Test TC-6a. An attempt was made to validate the 
instruments against on-the-job performance as judged by five types of 
supervisory ratings of Air Force ground-crew men in active units. The 
written tests showed lower validity than with the criterion of academic 
success. All validities were much lower than in unselected, untrained 
populations (290, 291). The U-Bolt Assembly Test appeared to be of 
some promise. Tests of informal information in shop mechanics, auto- 
motive and driver information, electricity, and radio, originally intended 
for use together with the latest form of the AGCT to form a comprehensive 
basic classification battery in initial processing, have instead been adopted 
for use at training centers to select men for specialized training. Two forms 
of the Automotive Information Test (AI-1 and 2) and of the Shop 
Mechanics Test (SM-1 and 2) have been standardized on large populations 
(252). Extensive item analyses and validity studies of all four tests have 
also been conducted (228, 236, 280, 281, 282, 283, 284), with adequate 
validities obtained. A preference record and a self-description form based 
on the forced-choice technic were validated against a production index 
and a 3-point rating in a study of selection instruments for personnel 
suitable for recruiting work (289), with disappointing results. 

Radio Code Operators. Investigations by the Personnel Research Section 
and various other agencies have resulted in the authorization for Army-wide 
use of two tests for selecting radio code operator trainees. Many other 
code aptitude tests have been considered. The criteria used for the validities 
reported here include number of hours to reach specified receiving speeds, 
final code speed attained, and the NDRC Code Receiving Tests. Usually, 
several of these were considered in each study (15). The Signal Corps Code 
Aptitude Test (SCCA) evolved from a test tried out by the Signal Corps 
between 1924 and 1931. By 1941 the SCCA was widely used by several of 
the Arms and Services. Usually administered by phonographic transcrip- 
tions, the SCCA was a code discrimination type test containing seventy-eight 
pairs of patterns to be identified as “same” or “different.” Reliability esti- 
mates by Kuder-Richardson Formula No. 21 were much lower than 
desirable, ranging from .67 to .78, except for one sample for which .88 was 
reported (60, 91, 140, 184, 195). Validity data varied considerably from 
one sample to another, with coefficients from —.03 to .57 (18, 60, 83, 92, 
139). Data reported by the Signal Corps for testing between the wars gave 
validities ranging from .54 to .75 (105). Little improvement in reliability 
or validity resulted from doubling the SCCA to make the Radiotelegraph 
Operator Aptitude Test ROA-1, X-1, which was authorized for Army-wide 
reception center use in July 1942. The Kuder-Richardson reliability was 
.87 and reliabilities by the split-half method ranged from .73 to .82 (146, 
184, 195). Validity of ROA-1, X-1 was only fair, usually around .30 (161, 
186, 195, 224). The test was standardized on the basis of SCCA results, 
standard scores being set to a mean of 100 and a sigma of 20 (102). 
Studies indicated that previous musical instrument experience as well as 
code experience were positively related to radio code test scores and 
added to success in radio code training (60, 184). Data from numerous 
radio operator specialist schools indicate that fewer failures result if men 
are preselected on ROA-1, X-1 plus AGCT rather than AGCT alone (224). 
Army Radio Code Aptitude Test, ARC-1, a code learning test developed by 
the National Defense Research Committee (18), took the place of ROA-1, 
X-1 toward the end of 1944. The test required the recognition of three 
learned Morse Code letters when presented with unlearned characters. 
Validities for ARC-1, usually between .50 and .60, were higher than those 
for ROA-1, X-1 or the Thurstone Code Aptitude Test (224). A check on 
the standardization sample resulted in the raising of raw score equivalents 
for the various standard scores (248). A series of Code Learning Tests, 
work-sample tests based on the same principle as ARC-1, showed consider- 
able promise but were never carried to the completion stage. Reliabilities 
by Kuder-Richardson Formula No. 21 and by estimation from odd-even 
correlations ranging from .94 to .98 have resulted for various editions of the 
test (65, 91, 128, 140, 161). Validities have, in general, been as good as for 
either the ROA-1, X-1 or ARC-1 (65, 140). A paper-and-pencil alphabet- 
symbol Substitution Test-1, X-1, also developed by the Personnel Research 
Section, gave reliabilities over .95 but was somewhat less valid than ROA-1, 
X-1 (128, 140, 161). A revised edition of a Code Rhythm Test developed 
by Thurstone has also shown some promise (105, 128). The Thurstone 
Code Aptitude Test was tried out in studies on ARC-1 and a revision 
of this test, designated ROA-2, X-1, was accomplished (205). Both tests 
were highly reliable but the original Thurstone test was more valid. 
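
For orientation on the effect of doubling the SCCA reported above, the Spearman-Brown formula gives the reliability to be expected when a test is lengthened with strictly comparable material. The sketch below is illustrative only: the source does not state that this formula was applied, the function name `spearman_brown` is introduced here, and the assumption of parallel added items may not have held for the ROA-1, X-1.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when a test is lengthened `length_factor` times with parallel items."""
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


# Lower and upper SCCA reliability estimates quoted in the text (.67 and .78),
# each projected to a test of double length.
predicted = {r: spearman_brown(r, 2) for r in (0.67, 0.78)}
```

Comparing such projections with the coefficients actually reported for the doubled test is one way of judging whether the added material behaved like the original.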

Truck Drivers. Research resulted in the standardization and validation 
of a group of tests (16), including a Driver Experience Inventory, a Driver 
Information Test, tests of visual acuity and night vision, and a reaction- 
time test. Other well-known psychophysical tests were assessed as predictors 
of driving ability. Most frequent criterion was a road test, consisting of 
fifteen to twenty minutes observation of driver in the standardized situa- 
tion. Specific tests were checked on a Road Test Check List. A score 
consisting of checks of correct or incorrect operations, weighted or un- 
weighted, was obtained, plus an over-all rating. Reliability coefficients for 
the road test were not as high as those usually obtained for objective tests, 
but equalled those usually obtained for criteria in validity studies (24, 
64, 72, 86, 126). Two forms of a Driver Information Test (DIT) were 
standardized (172, 176). Trial of personal history items showed driving 
experience items to be most valid (25, 72, 123, 134, 152). A Driver Experi- 
ence Inventory showed variable validity, fairly high in certain populations 
(120, 147, 172). 

Visual Acuity. Several of the more familiar visual acuity tests gave con- 
sistently low correlations with the road test (25, 72, 126). Tests of night 
vision gave higher validities against a special night road test (71, 76, 
131). Studies of race differences in night vision (25, 78, 121) produced 
no consistent or significant results. High sugar intake showed no effect on 
night vision (72). Studies are currently in progress on the standardiza- 
tion of new tests of visual acuity and night vision. 

Sensori-Motor Tests. Data show low positive and zero correlations and 
some inconsistency from sample to sample in studies of several sensori- 
motor tests as predictors of ratings of driving ability. However, popula- 
tions were often small and criteria not always reliable. In addition, soldiers 
are already a physically selected population (25, 71, 126). 


Trade Knowledge Tests 


Numerous editions of tests in electricity, radio, and automotive mechanics 
have been developed to aid in the identification of those with interests and 
aptitudes in these fields, as evidenced by possession of informal informa- 
tion. A General Electrical and Radio Information Test was constructed 
after item analysis of experimental forms (100). Subsequently, separate 
series in electricity and in radio were established. Limited analysis revealed 
some fair validities (227, 239). A General Automotive Information Test 
yielded a correlation of .67 with course grades of 147 men (110). The test 
was further expanded and item analyses and additional validity studies 
were carried out (122, 142, 169, 183, 239). 


West Point Qualifying Examinations 


Each year since 1942 a new form of West Point Qualifying Examination 
(WPQ) has been constructed for administration along with the regular 
West Point examinations. It is intended that this battery eventually re- 
place present West Point entrance examinations. The latest form of the 
WPQ contains two subtests, Language Aptitude (learning an artificial 
language), and Elementary Mathematics (the use of short-cuts in arith- 
metical and algebraic processes). Each year’s series was tried experi- 
mentally and prevalidated on classes already selected and attending the 
Academy and administered in final form to applicants the following year. 
Additional validity data were gathered subsequently, with academic success 
as criterion (164, 174, 182, 190, 219, 220). 


Selection of Warrant Officers 


Objective examinations have been developed in over thirty administra- 
tive and technical military specialties for the selection of warrant officers. 
The subjects range from Auditing and Accounting to Weather and Cryp- 
tography. No reliability or validity data have been gathered on these tests, 
altho they were constructed with the aid of technical experts and have 
been widely used (237, 275, 276). 


Personality Studies 


In 1942, Personnel Form P-1, also known as the Shipley Personality In- 
ventory, was proposed as a group test for military use in differentiating 
troublemakers, neurotics, and normals. Extreme troublemakers and ex- 
treme neurotics were identified by this instrument (149), but reliability 
(136) was very low. More intensive research in personality measurement 
by the Personnel Research Section was begun toward the end of 1943. 
The Minnesota Multiphasic Personality Inventory, adapted and revised 
for Army use, the Cornell Selectee Index, the Army Individual Test (See 
Chapter II), and the Biographical Information Blank (See Chapter VI) 
were among the instruments studied (61). The Multiphasic Personality 
Inventory, a paper-and-pencil objective test, scored separately for each of 
nine psychiatric classifications, showed promise; its items were analyzed 
and some selected (294) for inclusion in the Biographical Information 
Blank used in the regular army officer retention program (See Chapter 
II). Validity studies of the Multiphasic Inventory were made in connec- 
tion with predicting AWOL’s and psychiatric referrals among basic trainees, 
predicting psychiatric rejects among WAC applicants, selecting trainees for 
Arctic duty, and validating against careful psychiatric diagnoses. Results 
have been favorable enough to justify further study and development of 
the inventory. An item analysis of responses of “good” and “bad” WAC 
applicants did not show significant differences (314). Correlational analyses 
of the Army Individual Test, composed of six separate subtests and ad- 
ministered to a population of psychoneurotics and psychotics, suggested 
the validity of the AIT for differentiating between psychoneurotics and 
specific psychiatric diagnostic categories, particularly when products and 
squares of test scores, rather than sums and differences, were used (313). 
The Army Wechsler subtest scores were found to add little to predictions. 


II. Classification 
The Army General Classification Test 


The Army General Classification Test (3, 7), providing an index of the 
learning ability of recruits to facilitate classification for training and job 
assignment, was first released as AGCT-la in October 1940. Subsequent 
forms, including two Spanish versions (258), were issued during the war: 
AGCT-1b in April 1941; 1c and 1d in October 1941. These consisted of 140 
to 150 multiple-choice items on vocabulary, arithmetic, and block counting. 
Raw scores were converted to standard scores with a mean of 100 and a 
standard deviation of 20. Standard scores were divided into five Army 
Grades. A revised series, in which part scores were recorded for the first 
time, appeared as AGCT-3a in April 1945 and AGCT-3b in 1946 (236, 278). 
The AGCT-3 series contained four tests: reading and vocabulary (189), 
arithmetic computation, arithmetic reasoning (235), and pattern analysis. 
The total score was the equivalent of the AGCT-1 score while part scores 
were also used in classification. An information battery, originally intended 
for inclusion in the AGCT-3, was used instead for classification purposes 
at training centers. Four forms of each type of subtest in the AGCT-3 were 
developed and equated for content and difficulty. 
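
The conversion of raw scores to standard scores with a mean of 100 and a standard deviation of 20 may be illustrated by a simple linear rescaling. The sketch assumes a linear conversion; the source does not say whether the Army conversion tables were linear or normalized. The function name `to_standard_scores` and the raw scores shown are invented for illustration.

```python
from statistics import mean, pstdev


def to_standard_scores(raw_scores, reference_scores, target_mean=100, target_sd=20):
    """Linearly rescale raw scores so that the reference group has the target mean and SD."""
    ref_mean = mean(reference_scores)
    ref_sd = pstdev(reference_scores)  # population SD of the standardization sample
    return [target_mean + target_sd * (x - ref_mean) / ref_sd for x in raw_scores]


# Invented raw scores for a small standardization sample.
reference = [60, 70, 80, 90, 100]
standard = to_standard_scores(reference, reference)
```

A raw score equal to the mean of the reference group is converted to 100, and a raw score one standard deviation above that mean to 120.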

Standardization. Standardization of Form 1a (31) was accomplished, 
before the first inductees under the Selective Service Act entered the Army, 
on a population of regular Army and CCC men equated to the expected 
Army population by weighting on age, education, and area of residence. 
Due to several other factors which could not be taken into account, e.g., 
race, occupational deferments, illiteracy, direct commissions, the distribu- 
tion curve of the actual Army population varied from the expected. Despite 
this variance, the conversion table for Form 1b (40) was computed by 
combination regression of 1a and 1b scores, because the norms for 1a were 
already widely used for classification. Standardization of Forms 1c and 
1d (42) was based on 1a and 1b. Distributions on Forms 1c and 1d had 
less negative skewness, and the conversion tables were set up to yield 
Army grade percent midway between the old and the new forms. Improved 
discrimination of 1c and 1d was partly due to more equitable distribution 
of item difficulty. Form 3a was standardized (236) on a population of 
39,000, carefully stratified and weighted by age, education, race, and 
geographical location. 

Item Analyses. Studies of response frequencies, item difficulty, dis- 
criminating power, and item-test consistency (29, 30, 35, 41, 115) were 
used for guidance in construction of alternate forms. It was found that 
equal scores might represent widely differing performances in type of 
questions answered (143, 117). Most extensive item analyses were made 
on the four trial forms for the AGCT-3 (236). Final form items were care- 
fully graduated in difficulty and selected on the basis of item-total test 
correlation. 

Practice Effect. Study of practice effect on 1a and 1b scores (39) and 1c 
and 1d scores (42) showed small but consistent increases regardless of 
which form was taken first. Retesting after considerable lapse of time for 
Grade V men (173) and men in OCS (163) showed similar results, which 
were attributable to factors other than the effects of Army training. 

Part Scores. Altho part scores of the AGCT-1 were not used for classifi- 
cation, investigation was made of relative contributions, discriminative 
power, intercorrelation, reliability of parts, and correlations with part and 
total scores of other forms (36, 38, 114). Each part was found to make a 
significant contribution. Combined vocabulary and arithmetic scores of 
one form were found as good as total scores for predicting total scores on 
a second form. 

Reliability. Repeated reliability estimates on all forms by Kuder-Richard- 
son Formula No. 21 (31, 40, 42, 236), odd-even comparisons (31), retest 
(75), alternate forms (38, 42, 236), and Kuder-Richardson Formula No. 
2 (236) placed the reliability generally above .90. 
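
Kuder-Richardson Formula No. 21, cited above, requires only the number of items, the mean, and the variance of total scores. A minimal sketch follows. The function name `kr21` and the scores are invented for illustration, and the use of the population variance is an assumption of this sketch rather than something stated in the source.

```python
from statistics import mean, pvariance


def kr21(total_scores, n_items):
    """Kuder-Richardson Formula 21 reliability estimate from total scores alone.

    Treats all items as equally difficult, so it tends to underestimate
    the value obtained from item-level formulas.
    """
    m = mean(total_scores)
    variance = pvariance(total_scores)
    return (n_items / (n_items - 1)) * (1 - m * (n_items - m) / (n_items * variance))


# Invented total scores on a hypothetical 150-item test.
scores = [30, 55, 70, 85, 100, 115, 130]
reliability = kr21(scores, 150)
```

Because the formula needs no item statistics, it lends itself to repeated routine estimates on successive forms.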

Validation. Several hundred validity coefficients attest to the value of 
the AGCT in selecting men for a large number of Army specialist courses 
(27, 37, 57, 68, 73, 77, 89, 92, 94, 97, 99, 108, 113, 129, 132, 174, 175, 
176, 178, 201, 213, 223, 226, 277, 324, 336, 338). Most of the populations 
were preselected either on the AGCT or on some highly correlated factor. 
The criterion was usually academic grades. Where preselection was rigorous, 
correlations were lower. Validities for criteria involving personality, e.g., 
success in Officer Candidate School (99, 132, 175, 198), or formal academic 
background, e.g., success in the Army Specialized Training Program (324, 
336, 338), are low. AGCT-3a (227) was generally superior to AGCT-1. 
The reading and vocabulary subtest correlates highest with written ex- 
aminations, and pattern analysis is usually the best predictor of practical 
performance. Use of part or combined subscores in classification is ques- 
tionable because of high subscore intercorrelations. 

Relationship with Other Variables. Studies show high correlation with 
education (31, 118, 127, 136, 270) and with other well-known tests of 
mental ability (32, 34, 103, 104, 165, 257, 331), decreasing with restric- 
tive preselection, but no significant relationship with age (31, 118, 236) 
except that in highly selected groups, correlations tended to be slightly 
negative. Comparisons of male and female Army populations in age, cul- 
tural and educational background, selection methods, and geographical 
distribution, had inconclusive results (236). Comparisons of Negroes and 
whites (117, 270), complicated by social, cultural, and educational differ- 
ences, showed lower mean scores for Negroes, the difference decreasing 
where educational status was matched. Mean scores for northern soldiers 
of both races are higher than for southern soldiers of the corresponding 
race. An early tentative study of relationship to civilian and military oc- 
cupations (26) was made. Later studies showed a definite occupational 
hierarchy and sectional differences within occupations (270), despite con- 
siderable overlap, even between highest and lowest ranks; but no relation- 
ship to age or experience was found. Variability of scores was higher in 
lower level occupations. Occupations with restricted score ranges probably 
depend on abilities measured by AGCT; others with wider ranges depend 
more on specific interests or aptitudes. For counseling purposes a low score 
was considered possible ground for avoiding a high level occupation, but 
a high score per se is no ground for avoiding any occupation. 

Special General Classification Tests. A special Non-Language Test 2abc
to test illiterates and Grade V men was standardized (59) on a population 
with a normal distribution of AGCT scores. An Army Individual Test
(AIT) of general mental ability (16, 17) consisting of three verbal and 
three performance tests (221, 222) was standardized (230) on a group of 
1000 native-born literate whites. A study on a small population (222) 
indicated that the test could discriminate between Grade V men in Special 
Training Units who were likely to succeed or likely to fail in Army 
training. 


The Mechanical Aptitude Test (MAT) 


The general Mechanical Aptitude Test MA-1 appeared in February 1941. 
Forms MA-2 and MA-3 were released in October 1941. A later form, MA-4,
X-1, was built for WACs. MA-1 consisted of items on mechanical movements
(54), surface development, and shop mathematics. MA-2 and MA-3 differed
considerably from MA-1, containing mechanical information (23, 53),
mechanical comprehension (51, 50), and surface development; of which the
first two were found to be good predictors for mechanics courses (58).
MA-4, X-1 contained items on tool recognition, mechanical comprehension,
and surface development. Use of the MAT at reception centers,
where scores were recorded for all except illiterates and Grade V men 
on the AGCT, continued until April 1945. It was supplanted by the 
AGCT-3a, which contains a surface development section similar to the 
MAT. Thereafter the MAT was used whenever deemed advisable at train- 
ing centers. 

Standardization. MA-1 was standardized on 3452 men (47). Standard 
scores, with a mean of 100 and a sigma of 20, were calculated by
equivalent percentiles yielding a breakdown in five Army grades which
approximated a normal distribution. MA-2 and -3, based on item analysis of trial
forms (70, 82), were standardized (90) by equivalent percentiles for MA-1 
scores on a population of 2766 men. MA-4, X-1 was standardized by linear 
transformation (154), and was item analyzed (180). 
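
The two standardization routes named above, linear transformation and
equivalent percentiles, can be illustrated with a minimal Python sketch.
It assumes a standard score scale with mean 100 and sigma 20; the function
names and raw scores are hypothetical, and the percentile route here
normalizes against the sample itself rather than against MA-1 scores as
in the study cited.

```python
from statistics import NormalDist, mean, pstdev

def linear_standard_scores(raw, target_mean=100.0, target_sd=20.0):
    """Linear transformation: rescale raw scores to the target mean and sigma."""
    m, s = mean(raw), pstdev(raw)
    return [target_mean + target_sd * (x - m) / s for x in raw]

def percentile_standard_scores(raw, target_mean=100.0, target_sd=20.0):
    """Normalized scores: map each mid-rank percentile to a normal deviate."""
    n = len(raw)
    unit = NormalDist()
    out = []
    for x in raw:
        below = sum(1 for y in raw if y < x)
        ties = sum(1 for y in raw if y == x)
        p = (below + 0.5 * ties) / n  # always strictly between 0 and 1
        out.append(target_mean + target_sd * unit.inv_cdf(p))
    return out

raw_scores = [12, 18, 25, 25, 31, 40, 47, 58]  # hypothetical raw scores
linear = linear_standard_scores(raw_scores)
normalized = percentile_standard_scores(raw_scores)
```

The linear route keeps the shape of the raw distribution, while the
percentile route forces the scores toward a normal shape, so the two
differ most for skewed samples.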

Reliability. Estimates by the Kuder-Richardson Formula No. 21 (46, 47, 
90, 154), test-retest method (49, 163), and equivalent forms method (90) 
show satisfactory reliabilities for both total scores and subtests. 
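
As an illustration of the first method, Kuder-Richardson Formula No. 21
needs only the number of items, the mean, and the variance of total
scores, on the assumption that items are of roughly equal difficulty. The
sketch below is in Python; the function name and the scores are
hypothetical and are not data from the studies cited.

```python
from statistics import mean, pvariance

def kr21(total_scores, n_items):
    """Kuder-Richardson Formula 21 from total scores on n_items right/wrong items."""
    m = mean(total_scores)
    var = pvariance(total_scores)
    return (n_items / (n_items - 1)) * (1 - m * (n_items - m) / (n_items * var))

scores = [22, 31, 35, 38, 41, 44, 47, 52]  # hypothetical totals on a 60-item test
reliability = kr21(scores, 60)
```

Because Formula 21 ignores differences in item difficulty, it tends to
give a lower figure than Formula 20 computed on the same data.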

Validity. Validity studies gathered a wide range of correlations, usually 
lowered by preselection, with course grades and other criteria. As a verbal 
test, the MAT correlates best with theoretical course grades (66, 227) and 
motor mechanics (48, 49, 89, 97, 103, 125); less well with driver
performance ratings (156, 172); and negligibly with radio code receiving speed
(84, 201, 205). Varying results were obtained for clerks (94), aircraft 
warning operators (226), airplane mechanics (194, 201, 228, 229), basic 
trainees (68), and Navy trainees (52, 129). Validity of MA-4, X-1 with 
WAC specialist school grades as criteria (186) was superior to the AGCT 
for motor transport, but inferior for radio repair school. A study of MA-4, 
X-1 for civilian armament trainees (239) found that its validity would be
improved if the Surface Development Subtest were omitted. 

Relationship to Subtests and Other Tests. MA-2 and MA-3 were found 
to be superior to MA-1 in being less highly correlated with the AGCT (47, 
69, 87, 90, 95, 96, 226). Intercorrelations of total scores and subtests were 
high (47). Correlations were computed for the MAT and civilian
mechanical aptitude tests (49, 52, 226, 227).

Tests of Mechanical Aptitude for Civilians. A provisional battery, 
Mechanical Aptitude Test MA-5, was not as valid as standard civilian 
mechanical tests (250). A mechanical aptitude battery consisting of Learn- 
ing Ability Test LA-5 (an Air Corps test of mental ability), Tool Usage, 
Mechanical Problems, and Paper Form Board Test CG-106a did distinguish 
mechanics from nonmechanical workers (260). General Mechanical Apti- 
tude Test CM-142a yielded fair correlations with both final mechanical 
grades and supervisors’ ratings (263). A revision of this test, made shortly 
after V-J day, was designated CM-142ar (264). 


Clerical Aptitude 


Clerical Aptitude Test CA-1, completed in 1940, consists of 280 items on 
name checking, coding, catalog numbers, verbal reasoning, number 
checking, and vocabulary. It was standardized (20) on a group normally 
distributed by AGCT scores. Resulting distribution (63) was leptokurtic 
for both the standardization sample and field returns. Reliability by Kuder- 
Richardson Formula No. 21 was .95 (20), by test-retest method .72 (163). 
Validity coefficients are usually based on small populations, the criterion 
being clerical school grades (19, 47, 69, 94, 96). Since it is highly
correlated with the AGCT (21, 47, 69, 94, 96) and sometimes inferior as a
predictor of clerical grades to the AGCT, the Mechanical Aptitude Test
(94), or the Wells Revision of the Army Alpha (19), its usefulness has
been questioned. Experimental material for an uncompleted alternate form 
of CA-1 was used in constructing Clerical Aptitude Test CA-2, X-2 for the 
WAC, which covered classifying, cataloging, number and name checking,
alphabetizing, and spelling. In the standardization the score distribution 
departed markedly from the normal, and the percentile method was used 
to set up standard score scales. The Kuder-Richardson reliability was .97 
(154, 179). Validity studies (186) with grades in specialist courses as 
criteria showed CA-2, X-2 to be inferior to the AGCT for administrative 
specialists. More widely used than any of the above clerical aptitude tests 
were those developed for civilian employees of the War Department, which 
in chronological order of use were the General Proficiency Test WCT-I, 
X-3 (151, 155); CA-2, X-2 (179, 185); a provisional battery, Clerical
Aptitude Test CA-3 (261); and finally General Clerical Abilities Test
CC-105a (241, 247, 262, 292). Part A of CA-3 correlated .32 with
supervisors’ ratings in some jobs (233, 240). Parts of CC-105a had an average
correlation of .35 with Civil Service CAF grade and supervisors’ ratings, 
but in a number of cases reached correlations around .50 (259, 293, 317). 


Army Trade Screening Tests 


To verify skill status in Military Occupational Specialties a series of 
Army Trade Screening Tests and Experience Check Lists in clerical, 
mechanical, and other technical fields was developed (10, 286, 287, 288). 
Reliabilities, estimated by Kuder-Richardson Formula No. 4 for eight of 
the tests, ranged from .87 to .93. Critical scores were set for most of the 
tests to represent the level of technical achievement attained by graduates 
of the corresponding Army specialist course. Critical ratios between ex- 
perienced and inexperienced men were high. Critical ratios dropped when 
examinees were encouraged to guess. 
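
The critical ratio referred to here is the difference between two group
means divided by the standard error of that difference. A minimal Python
sketch follows; the function name and the scores are hypothetical, and
the conventional reading that a ratio of about 3 or more marks a
dependable difference is an assumption about the practice of the period.

```python
from math import sqrt
from statistics import mean, variance

def critical_ratio(group_a, group_b):
    """Difference between two means divided by the standard error of the difference."""
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se

experienced = [41, 44, 38, 47, 45, 42]    # hypothetical trade test scores
inexperienced = [22, 30, 27, 19, 25, 28]  # hypothetical trade test scores
cr = critical_ratio(experienced, inexperienced)
```

Guessing adds chance successes to the scores of inexperienced men, which
narrows the gap between the means and so lowers the ratio.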


III. Training
Measurement of Academic Knowledge 


Measures of educational achievement in the armed forces gained a new 
importance with the inauguration of the Army Specialized Training
Program. Tests in academic subjects such as Algebra, English, etc. were
constructed for the Air Forces (67, 87, 93), Corps of Engineers (68, 116),
Coast Artillery (142), and the WAC (170). These tests were used as early
selection devices until instruments of better validity were developed. A 
study was also conducted on the difficulty level and usefulness of a General 
Educational Test for Warrant Officer candidates constructed by the
Cooperative Test Service (249).

The Army Specialized Training Program: Selection Tests. At the
inception of the Army Specialized Training Program (8) a test constructed
for officer candidate selection, the OCT-2, X-3, was found useful as a gen- 
eral ability test (318, 319). It was a considerably better predictor of suc- 
cess in basic engineering courses than the AGCT (324, 334). For the Army 
Specialized Training Reserve Program, three Army-Navy Qualifying Ex- 
aminations were constructed by the College Entrance Board for screening 
applicants (320, 323, 331). A qualifying test (C-4), composed of mathe- 
matics and vocabulary items, was prepared by the Personnel Research Sec- 
tion for the same purpose, and was found to discriminate satisfactorily 
(341) among applicants. A Mathematics Inventory Test, from which the 
mathematics section of C-4 was derived, was used for placement in appro- 
priate curriculums, at the proper level of difficulty, and proved to be a good 
predictor of success in the ASTRP (340). A series of aptitude tests for 
professional medical training was also built (328, 329) and their relation- 
ship to AGCT (325, 326) and other educational factors (330) was studied. 
Certain achievement tests in mathematics and physics were used as selec- 
tion devices for some advanced courses. 

Achievement Tests. More than 150 different national achievement tests 
covering at least eight subjectmatter fields (6) were administered in all 
basic and advanced courses as a check on uniformity of content and ade- 
quacy of instruction given in approximately 200 different training units. 
These included tests for seven different foreign languages. A series of 
studies recorded reliabilities for the tests (6). Attempts were made to 
develop valid norms for test scores (322, 332). Item analyses on prelimi- 
nary forms (327, 337) aided in constructing more reliable forms of the 
tests. The validity of the tests as predictors of success in basic engineering 
was investigated (321). Studies of correlations between R and R - W/4
showed that guessing had little effect on test reliability (333, 335). A socio- 
economic study (339) determined that 30 percent of the trainees were 
receiving more education than their prewar plans contemplated. One result 
of the national achievement testing program was that many instructors 
who originally opposed objective testing in college courses came to accept 
its value. 
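
The comparison of rights-only scores with guessing-corrected scores can
be sketched as follows, assuming the standard correction of rights minus
wrongs divided by one less than the number of choices, which is R - W/4
for five-choice items. The function name and the data are hypothetical.

```python
def formula_score(rights, wrongs, n_choices=5):
    """Rights minus wrongs divided by (choices - 1); omitted items are ignored."""
    return rights - wrongs / (n_choices - 1)

# hypothetical (rights, wrongs) pairs for four examinees
examinees = [(60, 20), (45, 10), (70, 30), (52, 0)]
corrected = [formula_score(r, w) for r, w in examinees]
```

An examinee who guesses blindly on 100 five-choice items expects 20
rights and 80 wrongs, which this correction reduces to a score of zero.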


Military Training 


A Military Knowledge Test consisting of multiple-choice items and or- 
ganized in pictorial form thruout was developed to test the basic military 
knowledge required of all soldiers. This test evolved from an item analysis 
of several experimental forms. It was used as a device to determine whether 
men being redeployed needed refresher training. The test distinguished 
trained and untrained infantrymen; however, validities were around .30 
against the Soldier Performance Scale (See Chapter IV). 

Army Automotive Screening Battery. An Experience Check List, an
Apprentice Mechanics Test, a Tool Usage Film Strip Test, and Distributor
and Valves Assembly and Use of Tools performance tests, administered by
the “successive hurdle” method, have been used with great success to
screen army automotive students who could bypass elementary phases of
training. Based on trial of many experimental forms and methods of
scoring, the battery showed high validity (255). A subsequent follow-up study
showed that students selected to skip beginning phases of training on the 
basis of these tests completed the course even more successfully than stu- 
dents taking the entire course (256). 
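
The “successive hurdle” method can be sketched in Python as a chain of
cutoffs in which a student is dropped at the first test he fails and the
later tests are not consulted. The hurdle order, the cutoff values, and
all names below are hypothetical; the source does not report them.

```python
def successive_hurdles(scores, hurdles):
    """Return (passed, stage), where stage is the first failed hurdle or None."""
    for name, cutoff in hurdles:
        if scores[name] < cutoff:
            return False, name  # later hurdles are never examined
    return True, None

# Hypothetical hurdle order and cutoffs.
HURDLES = [
    ("experience_check_list", 30),
    ("apprentice_mechanics_test", 50),
    ("tool_usage_film_strip", 40),
    ("performance_tests", 70),
]

student = {"experience_check_list": 35, "apprentice_mechanics_test": 48,
           "tool_usage_film_strip": 60, "performance_tests": 90}
result = successive_hurdles(student, HURDLES)
```

Placing the cheaper paper tests first means the costly performance tests
are given only to students who have already cleared the earlier hurdles.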


IV. Measures of Proficiency and Criteria 
Truck Driving and Machine Shop Performance 


Performance criteria for truck driver trainees consisted of a practical 
road test checklist with objective ratings on specific items and a general 
driving proficiency rating, usually on a 5-point scale. The reliabilities of 
road test ratings are as high as those usually available for practical
performance criteria. An early attempt at extreme objectification (64) was
abandoned because of poor results. In one study the biserial correlation 
between number of unsatisfactory items on a checklist and general ratings 
was .83 for 1982 men and .28 for 1454 men rated under somewhat differ- 
ent conditions (72). Weighted checklist scores had tetrachoric correla- 
tions between .51 and .82 with general ratings for a sample of 1717 men 
(86). Other reliabilities are recorded using the split-half method (126) 
and test-retest method (24) on checklist scores. The general conclusion 
was drawn that reliability can be increased by training the examiners. 
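
A biserial correlation of the kind reported above relates a continuous
score to a rating that has been cut into two classes. The Python sketch
below uses the usual textbook formula, which assumes a normal
distribution underlying the dichotomy; the function name, the checklist
counts, and the ratings are hypothetical.

```python
from statistics import NormalDist, mean, pstdev

def biserial(continuous, flagged):
    """Biserial r between a continuous variable and a dichotomized one."""
    hi = [x for x, flag in zip(continuous, flagged) if flag]
    lo = [x for x, flag in zip(continuous, flagged) if not flag]
    p = len(hi) / len(continuous)
    q = 1.0 - p
    unit = NormalDist()
    y = unit.pdf(unit.inv_cdf(p))  # normal ordinate at the cut point
    return (mean(hi) - mean(lo)) / pstdev(continuous) * (p * q / y)

# hypothetical counts of unsatisfactory checklist items per driver
unsatisfactory_items = [1, 2, 2, 3, 4, 6, 7, 8, 9, 10]
# hypothetical general rating: True means rated unsatisfactory
rated_unsatisfactory = [False, False, True, False, False,
                        True, False, True, True, True]
r_bis = biserial(unsatisfactory_items, rated_unsatisfactory)
```

When the underlying distribution is far from normal the coefficient can
exceed 1.00, which is one reason such values need cautious reading.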

Three raters were used to rate examinees on performance on a list of 
common machine shop operations. Estimated agreement among raters was 
fair. Average reliability of all three raters was .80 (107). 


Soldier Performance Report 


Major use of the Soldier Performance Report was as a criterion for pre- 
dictor tests such as the Army General Classification Test and the Army 
Individual Test in an effort to screen potential satisfactory soldiers from 
poor risks before basic training. Two early experiments (98, 203) were 
reported. Another study, validating a group of induction station tests (208), 
used a scale restricted to marginally satisfactory and unsatisfactory ratings. 
Contingency coefficients of reliability ranged (corrected) between .64 and
.78. Somewhat lower coefficients were obtained for a much more restricted
group on AGCT scores (211). In the validation of induction station tests 
an 8-point scale and a composite checklist were used (210). To validate 
a military knowledge test, a soldier performance scale (279), in which a 
superior noncommissioned officer rated the soldier on a 5-point description 
scale, showed satisfactory reliability (above .80) by rating and rerating 
comparisons and correlations between ratings by platoon leaders and 
platoon sergeants. 


December 1948 PERSONNEL RESEARCH PROGRAM OF THE ARMY 





AAF Technical School Success 


Attempts were made to find criteria more reliable than academic course 
grades for predictors of success in AAF schools. Paired comparisons (229) 
showed high reliability for small groups of ratees. Closest approximation 
to on-the-job conditions was tried (290) on technicians in AAF combat 
units in the ZI. Five types of on-the-job rating were secured: rank in
over-all job ability, paired comparisons, and a five-step scale on performance,
personality, and over-all worth. Odd-even reliabilities were high except 
for personality. Intercorrelations were about .90. 


War Department Civilian Employees 


Dissatisfaction with the reliability of criteria in use for test predictors 
in clerical work led to some experimental work on supervisory ratings. 
Two ability rating scales and a trait scale were constructed (293). They 
showed less correlation with predictor test grades than did civil service 
grade. 


Officer Efficiency 


A criterion originally developed to validate devices for measuring lead- 
ership and personality fitness among officers became the backbone of sev- 
eral programs of officer selection, retention, and efficiency reporting. The 
adequacy of the criterion depends on the agreement of groups of officers 
intimately acquainted with the character and proficiency of given officers 
as to their placement in widely separated positions along a continuum 
of over-all competence (343). It was determined that a random group 
of ten rating officers can distinguish the over-all competence of officers 
almost as well as can a designated group of ten selected raters. In order 
that assignment to any criterion group be reliable, the officer being rated 
should be known well enough to be rated by at least seven out of ten 
raters. This procedure was perfected in the “Buddy Rating System” (295, 
297) in an experiment with Officer Candidate School classes. Pooled inde- 
pendent ratings by a group of “buddies,” when checked against ratings 
by the platoon officer, yielded a highly reliable criterion against which 
to measure selection instruments. The corrected split-half reliability varied 
between .81 and .91, and correlations between buddy and platoon officer
ratings ranged from .51 to .59. Greater reliability (295) was obtained 
as length of acquaintanceship increased. A system was devised for assign- 
ing a criterion index score of from 0 to 60 to include, in addition to definite 
criterion groups of high, middle, and low competence, those men of more 
indeterminate status (298). A variant of the original pooled rating sys- 
tem, comparison of the officer with Army Officers in general and with offi- 
cers of the same grade on a 20-point scale, was later developed and showed 
high correlation with the criterion index (266). 
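
The source does not name the correction applied to the split-half
coefficients; the usual choice is the Spearman-Brown formula, which is
assumed in this minimal Python sketch. The same formula estimates the
reliability of pooled ratings from the reliability of a single rater.
The function name and all values are hypothetical.

```python
def spearman_brown(r, k=2.0):
    """Reliability of a measure lengthened k times, given reliability r of one unit."""
    return k * r / (1.0 + (k - 1.0) * r)

half_r = 0.70                        # hypothetical correlation between two halves
full_r = spearman_brown(half_r)      # corrected split-half reliability
pooled = spearman_brown(0.40, k=10)  # hypothetical: ten raters, single-rater r of .40
```

The second call shows why pooling independent ratings pays: a modest
single-rater reliability rises sharply as the number of raters grows.
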


V. Studies of Leadership


Officer Selection 


Tests, rating forms, officer evaluation reports, interviewing procedures,
and other devices were developed and investigated for measuring
background, learning ability, and leadership qualities of officers and
officer candidates. In 1941 two forms of the Higher Examination, H-1 and
H-2, containing the most difficult vocabulary and arithmetic items, were
constructed. These forms were intended for more exact discrimination
among candidates in Army Grades I and II on the AGCT. Tables of
equivalent scores with the AGCT were prepared (74). Both forms correlate
highly with each other and with the AGCT. Reliability coefficients are
high. Undue emphasis on speed caused examinations to be discontinued
for officer candidate selection because the speed factor appeared to
discriminate against the older men. Form H-1 correlated .48 with final
grades for sixty-seven engineer officer candidates (79, 85, 88). War
Orientation Test,
WOT-1, X-1, containing 100 five-alternative items on information about 
current events, had high reliability and gave significant differences in means 
between officer candidates and basic trainees, but had lower validity than 
the AGCT as a predictor of success at Officer Candidate School (153). 

Officer Candidate Tests, OCT-1 and OCT-2. Experimental forms con- 
tained items on comprehension of paragraphs and graphic material and 
on arithmetic reasoning, which were chosen from the Army Officers Train- 
ing Examination, a battery developed for the War Department by the Co- 
operative Test Service. Reliabilities for the first experimental form were 
not satisfactory but higher correlations with OCS grades were obtained 
than for the AGCT (148). Conversion tables to AGCT scores were pre- 
pared and an item analysis made (118, 167). The test was rejected because 
informational content was taken from commonly used War Department
manuals. Two final forms, OCT-1 and OCT-2, were constructed after item 
analysis of two new experimental forms (168) and standardized on 2000 
men (175). Reliabilities were .81 and .91 respectively by the
Kuder-Richardson Formula No. 21. Both forms correlated highly with the AGCT
and with years of education in an unselected population. Validity
coefficients were high, both tests being far superior to the AGCT as predictors
of academic success in Officer Candidate School (198). 

Leadership Studies. Early approaches were based on analysis of War 
Department and civilian literature on leadership, management, etc. Two 
rating scales were developed but not validated. Interview procedures and 
forms were also developed and analyzed (177). Projective technics were 
investigated by the administration of the Rorschach and Thematic Apper- 
ception Tests, sentence absurdities, picture absurdities, and Philo-Phobe list 
to fifty-two men. Most correlations with leadership ratings were low. Prac- 
tical difficulties precluded the use of these instruments on a large scale 
(177). Preference Inventory, PL-1, X-1 contained 100 groups of three activ- 
ities, each presumably preferred by the combat leader, administrative 
leader, or the nonleader. Correlations with leadership ratings at OCS were
insignificant (204). Leadership Test, L-1, X-1, requiring judgment in lead- 
ership situations, was also discarded (178). Combat reports from the North 
African campaign and analysis of leadership selection by British and 
German armies and of civilian research in the United States led to reexam- 
ination of leadership selection methods. Suggestions were made by the sub- 
committee on leadership of the American Psychological Association Emer- 
gency Committee on Psychology. Ernest Ligon, Consultant to the Secretary 
of War, reported on lack of uniformity in current officer selection proce- 
dures. A Combat Adaptability Rating Scale was used in conjunction with a 
series of tests including an interview, performance situation, and stress 
situations. Reliability of ratings was high, but low correlations were ob- 
tained between the rating scale and tests (225). No follow-up studies were 
made of individuals in actual combat because of administrative difficulties. 

Officer Retention Program. The largest, most successful, and most revolu- 
tionary program in leader selection was worked out for the program of 
selection from among temporary officers of those to be given permanent 
commissions and integrated into the postwar regular Army (4). Personnel 
instruments developed include an Officer’s Application for Commission; 
an Officer Classification Test, OCT-14, a test of general learning ability 
of suitable reliability but not adopted for other reasons; a General Survey 
Test of general educational achievement, including material from the 
fields of English usage, humanities, physical and biological sciences, and 
social sciences; a Biographical Information Blank; an Officer Evaluation 
Report, an improved efficiency rating device; and a Standard Interview 
Procedure, a new type board interview which was objective, reliable, 
uniform, and completely different from usual Army board proceedings. 
The General Survey Test is used as an initial hurdle, while scores on the 
Biographical Information Blank, Officer Evaluation Report, and Interview 
are combined to yield a composite score indicating over-all fitness. All in- 
struments have been shown to be valid for representative officer samples 
against rigid criteria of agreement by fellow officers as to each applicant’s 
over-all fitness (13). A general bibliography on leadership was compiled 
as background for the program (9). 

Construction of Selection Instruments. Preliminary forms were tried out 
on approximately 8000 officers and officer candidates. Two 125-item forms 
of the Officer Classification Test (OCT), containing sections on reading 
comprehension, arithmetic reasoning, and interpretation and judgment, 
were each administered to groups of 500 officers. Two final 110-item forms 
were constructed on the basis of item analysis (306). A study of equivalence 
showed appreciable difference between forms (305). Form A had good 
validity for predicting success in technical courses (307), but did not 
correlate with general criterion of officer competence used in major study 
(see Validity of Battery, below). Form 1 of the General Survey Test (GST) 
contained 200 items selected from two preliminary forms after item analysis 
of 1000 cases (306). Percents of applicants selected by various cut-off 
scores by Arm and Service and educational level were determined (311).
The Biographical Information Blank (BIB) (297) provided a means for
objectively measuring elements of past experience and personal char- 
acteristics, experimentally determined to be significant for predicting 
officer success. Form E, the final form used, contained 204 items divided 
into four parts: eighty-two biographical items, twelve pairs of self-evalua- 
tion items, ninety-four officer description items in forty-seven groups, and 
sixteen multiphasic pairs of items. All technics of personality measurement 
which had shown promise were investigated and the “forced-choice” technic
exploited. Nine types of items received preliminary trial. Self-description 
items were presented in quintets (295) containing two desirable, two 
undesirable, and one neutral alternative. The two desirable or undesirable 
characteristics were equal with respect to degree of desirability, but differed 
as to relative importance for officer success. Scale values had been obtained 
previously (296). One hundred ten pairs of items from the Minnesota 
Multiphasic Inventory, Form TC-8a, were tried (294). The criterion for 
item selection on the BIB consisted of “buddy ratings” by fellow officer
candidates and a ranking by platoon officers. Reliability of the “buddy 
ratings” was high. Correlations of ratings with the AGCT and
education are low, consistent with other studies. Alternatives were then analyzed
as to correlations with high or low criterion groups of officers. Various 
methods of scoring and the effects of various cutting scores were analyzed. 
Development of an objective Officer Evaluation Report (OER) was 
begun with appraisal of current Army efficiency rating methods. The 
War Department AGO Form 67, Officer Efficiency Report, and the 
AAF Form No. 123, Officer Evaluation Report were subjected to in- 
tensive analysis including intercorrelations of sections and trait ratings 
and factor analysis (301, 302). The technic of collecting statements 
from officers and enlisted men concerning characteristics of good and 
poor officers and refining and scaling these statements was
investigated (300). Also available were the findings on investigations of the
“forced-choice” technic (295, 296). Discriminating power of every item 
in nine different types of rating scales was determined. The Interview
is a standardized, objective procedure which breaks sharply with tradition. 
It is intended specifically to evaluate ability to deal with people. Board 
members observe behavior and record observations, then check descrip- 
tions, integrate these into ratings on specific areas of behavior, and finally 
evaluate candidate’s ability to deal with people. Objectivity was achieved 
by defining overt behavior that could be observed and judged during the 
interview and developing conversational situations designed to elicit this 
type of behavior (299). 

Validity and Reliability of the Entire Battery. Two purposes were in- 
tended: (a) to select officers who were outstanding in past and present 
performance of duty, and (b) to assure the ability of such officers to remain 
outstanding in the future. For achievement of the first aim, scores on the
Officer Evaluation Report (OER) and the Biographical Information Blank
(BIB) for 3000 officers and on the Interview (INT) for 1359 officers were
validated against job performance as evaluated by a large number of 
fellow officers. Approximately 13,000 officers were studied for development 
of criterion groups. Three groups, high, middle, and low, of about 1000 
men each were used, consisting of men clearly and consistently placed in
these categories by fellow officers in battalions or similar groups and by
commanding officers. To achieve the second purpose, scores of 3000 officers 
on the General Survey Test (GST) were compared with educational level 
attained and scores for 367 officers on the Officer Classification Test (OCT)
were compared with scores on the AGCT. A combined point-index based
on the OER, the BIB, and the INT adequately differentiated officers on
the basis of efficiency and did so in a manner far superior to the traditional
Army board proceedings. Percent of most competent and least competent
officers chosen by various cutting scores was determined. Mean score on
the GST showed a high relationship to educational level achieved and
showed high variability at each educational level. All instruments and the 
criterion were determined to have suitable reliability (298). 

New Officer Candidate Program. Instruments devised and validated in
the integration program for officers were adapted for selecting candidates
for Officer Candidate Schools among enlisted applicants of the Signal 
Corps (303, 304) on the basis of leadership. An interview procedure, a 
biographical information blank, a military report, and a recommendation 
blank were validated against pooled buddy ratings and platoon officer 
ratings at various stages of training. The AGCT and OCT were found to 
be satisfactory predictors of academic success. This work was expanded 
to include the development of officer candidate selection instruments on 
an Army-wide basis. Forms used in the Signal Corps study were revised 
(312) after analysis. 

Integration of Nurses into the Regular Army. Items for a biographical 
information blank (315) and an evaluation report (316) were secured 
from an analysis (285) of essays on good and poor nurses and from officer 
characteristics evaluated previously (296). 

Officer Efficiency Rating Methods. A thoro research program on officer 
efficiency reporting methods grew out of the investigation of the usefulness 
of the semiannual officer Efficiency Report, WD AGO Form 67, as a selec- 
tion device for the retention of wartime officers in the regular Army (298). 
Five methods were evaluated: the currently used WD AGO Form 67 (301) 
and AAF Form 123 (302); a forced ranking form FR-2 (272); a report 
(OER-A) using the rating checklist technic (273); and a report (FCL-2) 
using the forced-choice technic (274, 295, 296, 297, 308). These were 
validated against four separate criteria: (a) Position in criterion groups 
of high, middle, and low officers as rated by groups of fellow officers (295, 
297); (b) a criterion index score of from 0 to 60 based on these nomina- 
tions (266, 298); (c) an over-all rating on a 20-point scale in comparison 
with Army officers in general; and (d) comparison with officers of the 
same grade. Results showed a clear superiority of the FCL over the other 
four instruments (265, 266, 267, 268). In consequence, a revised form,
FCL-3, was tried with corroborative results (269). Later studies found 
that validity was increased by combining the rating checklist of the 
OER-A with the FCL-3 (309), tho the FCL alone is superior to the RCL
alone; that forced ranking as used in FR-2 increased validity slightly when 
incorporated with other forms, but had low validity alone; and that
indorsement of ratings improved validity slightly (271), but later training had
little effect (270). 


VI. Tabulating and Analysis Technics 


Test Validity 


A formula was developed for estimating the reduction in size of
correlation coefficient when mean scores are inserted for “no data” cases
(234), as well as formulas (207) for estimating change in r and other
statistical constants due to selection on a single variable, either
predictor or criterion. An empirical study of effects on obtained
correlation of restriction in range led to results fairly comparable with
predictions on basis of Kelley's
formula (95). A method for estimating the probability of obtaining a 
score at or above the mean on the criterion for any given score on the 
predictor variable (28) was further developed (144, 145) by a method 
for estimating the probability that an individual with any given score on 
the predictor will fall at or above any given critical score on the criterion. 
The original method was extended to make it applicable to evaluation of 
significance of differences in test means for two samples (111). Two 
methods were presented for estimating test efficiency (166). Anothe: 
approach is given in Richardson’s formula (206) for interpreting a test 
validity coefficient in terms of increased efficiency of a selected group o! 
personnel. A method was proposed for estimating the size of the samp| 
required for test standardization (62). 
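
Two of the ideas in this paragraph lend themselves to a short numerical sketch: the shrinkage of r under restriction in range, and the probability that a man with a given predictor score reaches a critical criterion score. The code below assumes direct selection on the predictor and a bivariate normal relationship, and uses the usual textbook correction for that case, which may differ in detail from the formulas in the reports cited. All function names are illustrative only.

```python
import math

def restricted_r(r_full, sd_ratio):
    """Expected r after direct selection on the predictor.

    sd_ratio = restricted SD / unrestricted SD of the predictor (assumed known).
    """
    k = sd_ratio
    return r_full * k / math.sqrt(1.0 - r_full ** 2 + (r_full * k) ** 2)

def corrected_r(r_restricted, sd_ratio):
    """Inverse of restricted_r: estimate r in the unrestricted group."""
    k = 1.0 / sd_ratio
    return r_restricted * k / math.sqrt(
        1.0 - r_restricted ** 2 + (r_restricted * k) ** 2
    )

def normal_cdf(z):
    """Cumulative distribution function of the standard normal curve."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expectancy(z_predictor, z_critical, r):
    """P(criterion z-score >= z_critical | predictor z-score = z_predictor).

    Assumes a bivariate normal surface with correlation r: the criterion
    scores for a given predictor score are normal with mean r * z_predictor
    and standard deviation sqrt(1 - r^2).
    """
    mean = r * z_predictor
    sd = math.sqrt(1.0 - r ** 2)
    return 1.0 - normal_cdf((z_critical - mean) / sd)

# Hypothetical figures: validity .60 in the full range, predictor SD halved.
shrunken = restricted_r(0.60, 0.5)
recovered = corrected_r(shrunken, 0.5)
chance_above_mean = expectancy(1.0, 0.0, 0.50)
```

Setting `z_critical` to zero gives the simpler case of a score at or above the criterion mean; other values give the extension to an arbitrary critical score.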

IBM Equipment. Maximum utilization of IBM equipment and elimina- 
tion of errors introduced by inaccurate usage were of some concern. An 
extra circuit added to the test scoring machine will give certain derived 
scores directly by shifting the zero point (43). Errors result from the use 
of Government Printing Office Answer Sheets when the test scoring machine 
is set for IBM Answer Sheets (101). Favorable conclusions were reached 
concerning possibility of utilizing No. 1 pencils instead of IBM pencils 
in marking answer sheets (135). Detailed steps necessary in checking the 
adjustment of the IBM Test Scoring Machine were reported (245). A 
tabulation was also made (45) of differences in scores between machine- 
scored and hand-scored answer sheets. 

Test Selection. A procedure was developed for estimating the proportion 
of the variance of the total scores on a test contributed by each of the 
parts (130). The effect of a suppressor variable upon Wherry test selection 
results was also discussed (243). 

Item Analysis. Tetrachorics were found more reliable than item-test 
correlations obtained with use of Richardson’s Nomograph (80). Errors 
in the use of Richardson’s Nomograph to analyze items not attempted by 
all subjects were pointed out (44). The effect of guessing on biserial 
correlations obtained between items and true scores was found to vary 
with item difficulty (244). Maximum r obtainable with Adkins-Toops 
Quintile Formula was found to vary with difficulty of item (238). 

Test Reliability. Kuder-Richardson Formula No. 21 appears to under- 
estimate reliability of scores based on the average of two administrations. 
A new formula was suggested (133). Insofar as assumptions underlying 
Kuder-Richardson Formula No. 21 are met, the addition of zero scores 
will increase the magnitude of the r obtained for all r’s less than unity 
(81). Kuder-Richardson Formula No. 20 appears to overestimate the 
reliability of a test when the distribution of item difficulty is highly 
skewed (200). A technic was given (200) for computing practice effect, 
difference in difficulty of parallel forms, and difference in level of ability 
for two groups taking two forms of a test. 
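
The behavior of the two Kuder-Richardson estimates can be checked on small made-up data. The sketch below uses the usual textbook forms of Formula No. 20 and Formula No. 21 with variances computed over N. The function names and the toy scores are illustrative assumptions, not material from the reports cited.

```python
def _pvariance(values):
    """Variance computed over N (not N - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n

def kr20(item_matrix):
    """KR-20 from a matrix of 0/1 item scores (rows = examinees)."""
    n = len(item_matrix)
    k = len(item_matrix[0])
    totals = [sum(row) for row in item_matrix]
    var_total = _pvariance(totals)
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in item_matrix) / n  # proportion passing item j
        sum_pq += p * (1.0 - p)
    return (k / (k - 1)) * (1.0 - sum_pq / var_total)

def kr21(total_scores, k):
    """KR-21 from total scores on a k-item test (assumes equal item difficulty)."""
    n = len(total_scores)
    mean = sum(total_scores) / n
    var_total = _pvariance(total_scores)
    return (k / (k - 1)) * (1.0 - mean * (k - mean) / (k * var_total))

# Toy answer matrix: five examinees, four items of unequal difficulty.
answers = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
r20 = kr20(answers)
r21 = kr21([sum(row) for row in answers], 4)

# Effect of adding zero scores to a distribution of scores on a 10-item test.
scores = [2, 4, 6, 8, 10, 3, 7, 9]
before = kr21(scores, 10)
after = kr21(scores + [0, 0], 10)
```

With these toy figures Formula No. 21 falls below Formula No. 20 because the items differ in difficulty, and adding two zero scores raises the Formula No. 21 estimate from about .75 to about .88.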

Computing and Facilitating Tables, Nomographs and Work Sheets or 
Job Descriptions. The following devices for facilitating computations were 
suggested: (a) job description of Wherry-Doolittle test selection method 
(159); (b) job description and work sheet for computing Pearson r by 
“difference” (diagonal) method (187); (c) work sheet for applying 
Adkins-Toops simplified formulas for item-selection (238); (d) job 
description and work sheet for factor analysis involving thirty-five or fewer 
variables (124); (e) work sheet for correcting correlations for restriction 
on one variable (253); (f) item analysis against median split on total 
test score (Richardson’s Nomograph) (80); (g) expectancy figures based 
on validity coefficients for various Army tests (232); (h) table for 
changing ranks in groups smaller than 100 to equivalent rank in a group of 
100 (193); (i) values of ΣX, ΣX², ΣXY, ΣY, and ΣY² for values of N 
from one to twenty for each cell of a 13 x 13 scatterplot (188); (j) 
four-place table of pq/z for three-place values of p or q (141); (k) 
facilitating tables for obtaining standard AGCT scores from number of 
attempted items and number of right answers (33); (l) probable error of 
median for certain values of Q and N (28); (m) values of 1 - r², √(1 - r²), 
and a further quantity involving √(1 - r²) for various values of r (2); 
and (n) values of two quotients with denominator √(1 - r²) for values of 
r (11). 
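
The “difference” method of item (b) presumably rests on the identity Σd² = Σx² + Σy² - 2Σxy, where x and y are deviations from their means and d = x - y; this lets the cross-product sum be recovered from three sums of squares. The sketch below illustrates that identity only. It is an assumption that the work sheet in (187) follows the same route, and the function name is illustrative.

```python
import math

def pearson_r_by_differences(xs, ys):
    """Pearson r computed from sums of squares of x, y, and (x - y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    sum_x2 = sum(d * d for d in dev_x)
    sum_y2 = sum(d * d for d in dev_y)
    sum_d2 = sum((dx - dy) ** 2 for dx, dy in zip(dev_x, dev_y))
    # Cross-product sum recovered without multiplying x by y directly.
    sum_xy = (sum_x2 + sum_y2 - sum_d2) / 2.0
    return sum_xy / math.sqrt(sum_x2 * sum_y2)

# Small made-up example.
r = pearson_r_by_differences([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
```

The result agrees with the ordinary product-moment formula; the gain on a desk calculator was that only squares, not cross-products, had to be accumulated.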
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Mechanical Apprehension Test Scores with Airplane Mechanics Course Grade: 
May 1941. 


58. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 121. A Study of Mechanical and Intelligence 
Tests as Possible Predictors of Success in Motor Mechanics and 
Communication Courses at Fort Sill. May 1941. 

59. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 125. Procedure Used in Scaling the 
Non-Language Test 2abc. April 1941. 

60. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 143. The Selection of Radio Operators and 
Mechanical Students. March 1942. 

61. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 149. Standardization of Classification Test R-1. 
June 1941. 

62. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 151. Empirical Check on Sampling Effects and 
Size of Required Sample. October 1941. 

63. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 159. Tabulation of Clerical Aptitude-1 Army 
Grade Distributions from Field Returns to August 1, 1941. October 1941. 

64. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 163. Reliability of the Road Test. October 1941. 

65. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 166. Reliability of the Code Learning Test and 
Relation to the Radiotelegraph Operator Aptitude Test. November 1941. 

66. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 167. Comparison of Mechanical Aptitude-1 Scores 
and Success in Signal Corps Post School, Fort Monmouth. November 1941. 

67. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 168. Item Analysis of Educational Achievement 
Tests, EA-1, X-1, Maxwell Field. September 1941. 

68. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 170. Validity of Classification Tests (the AGCT 
Non-Language Test, NL-2abc, Mechanical Aptitude, MA-1, Clerical Aptitude, 
CA-1) for Engineer's Training Course. October 1941. 

69. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 171. Study of the CA-1, MA-1, and AGCT Score 
Distributions of Selectees at Camp Croft, South Carolina. August 1941. 

70. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 175. Comparison of Scores of “Mechanics” and 
“Non-Mechanics” at Camp Lee, Virginia, on Forms A and B of the Surface 
Development Test (Experimental MA-2, -3, S.D.), Mechanical Comprehension 
Test (Experimental MA-2, MC), and the AGCT. November 1941. 

71. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 176. Reliability of Psycho-Physical Tests Used 
at Camp Lee, Virginia. October 1941. 

72. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 178. Summary of Fort Knox Driver Study. 
November 1941. 

73. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 181. Studies on Prediction of Achievement by 
Prospective Bombardiers. June-December 1941. 

74. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 182. Scaling of Higher Examinations, H-1 and 
H-2. November 1941. 

75. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 193. Reliability of AGCT-1a by Test-Retest 
Method. November 1941. (Supplement to the above, January 1942.) 

76. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 195. Validation of Night Vision Test. December 
1941. 

77. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 196. The Relation of MA-1, AGCT-1a, and 
Education to Auto Mechanics Final Grades at Fort Knox, Kentucky. December 
1941. 

78. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 206. Night Vision and Its Relation to Race 
[illegible] Blood Sugar. 

79. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 208. Prediction of Final Course Grade from 
Higher Examinations, H-1, and Army Officer Training Examinations. 
December 1941. 

80. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 211. Computation of Tetrachoric Correlations 
by Chesire-Saffer-Thurstone and Richardson Charts. 

81. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 215. The Effect on the Reliability Coefficient 
of Adding Zero Scores to the Distribution of Scores. December 1941. 

82. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 217. Differences in Test and Retest Scores on 
Experimental MA-2 (Mechanical Comprehension, Mechanical Information, 
and Surface Development) after (9) Weeks Training at Enlisted Men’s School, 
Fort Belvoir, Virginia. December 1941. 

83. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 219. Relationship of Years of Education and 
Signal Corps Code Aptitude Test Scores to Final Course Grades. October 1941. 

84. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 220. Prediction of Code Speed from AGCT, 
MA-1, CA-1, and ROA Tests. December 1941. 

85. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 222. Internal Evidences of Relative Difficulty 
of Higher Examinations, H-1 and H-2. July 1941. 

86. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 223. Reliability of Camp Lee Road Test. 
December 1941. 

87. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 225. Summary of Academic and Aptitude Test 
Results for Bombardiers and Navigators at Maxwell and Ellington Fields. 
December 1941. 

88. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 228. The Reliability of Higher Examinations, H-1 
and H-2. January 1942. 

89. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 229. Relation of MA-1, AGCT, and Education 
with Final Grades at the Tank Mechanics Course, Fort Knox, Kentucky. 
January 1942. 

90. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 234. Report on the Standardization of MA-2 
and MA-3. January 1942. 

91. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 235. Selection of Radiotelegraph Operators. 
January 1942. 

92. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 236. Reports: On the Value of the Code Aptitude 
Test and the Army General Classification Test for Predicting Success at 
Radio School; Relationship of Code Aptitude Test Scores to Musical Ability 
and Army General Classification Test Scores. December 1941. 

93. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 237. Final Report on the Scoring and Reporting 
of Results on the Air Corps Achievement Examinations Given November 12, 
1941. January 1942. 

94. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 242. Prediction of Final Grades of Graduates 
of the Clerical Course, Fort Knox, Kentucky, from AGCT, CA-1, and MA-1 
Scores. January 1942. 

95. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 244. The Effect of Restricted Ranges of Ability 
on Correlations Between AGCT and the Three Forms of the Mechanical 
Aptitude Test. January 1942. 

96. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 245. Tables of Scores of Commissioned Officers 
on the AGCT, Clerical Aptitude Test, and Mechanical Aptitude Test. January 
1942. 

97. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 246. Grades of Motor Mechanics as Related to 
Part and Total Scores on MA-1 and AGCT, Camp Lee, Virginia. January 1942. 

98. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 252. Analysis of Soldier Performance Report 
Data. February 1942. 

99. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 253. Selection of Officer Candidates: Relation 
of AGCT, Education, and Other Variables to Success in Officer Candidate 
School. February 1942. 

100. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 255. Summary of Construction of Electricity and 
Radio Information TK-1, X-2. February 1942. 

101. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 258. A Comparison of the Amount of Tolerance 
for Misplaced Answers Found in the GPO and IBM Machine-Scored Answer 
Sheets. February 1942. 

102. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 266. Standardization of the Radiotelegraph 
Operator Aptitude Test, ROA-1, X-1. May-November 1942. 

103. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 267. Tables of Equivalents for Otis, Army Alpha, 
and AGCT Scores. 

104. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 268. Notes on the Preparation of Conversion 
Tables from Army Alpha Raw Scores to Corresponding General Classification 
Test-1a Standard Scores. December 1940. 

105. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 269. Study of Tests for the Determination of 
Code Aptitude. 

106. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 271. Selection of Items from the Army General 
Classification Test, AGCT-1b, for Classification Test, R-2. February 1942. 

107. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 275. The Prediction of Machine Shop 
Performance, Air Corps Technical School, Chanute Field, Illinois. March 1942. 

108. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 277. Prediction of Grades in Gunnery School 
from MA-1 and AGCT. March 1942. 

109. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 278. Analysis of CA-2 Data Obtained at the 
Clerical Section of the Armored Force School, Fort Knox, Kentucky. March 
1942. 

110. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 283. Item Analysis: Automotive Information 
Test, TK-1, GAI-1, X-1. May 1942. 

111. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 284. The Evaluation of Differences Between the 
Test Means for Two Sample Populations. March 1942. 

112. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 286. Analysis of Wechsler Self-Administering 
Test Data. March 1942. 

113. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 287. Prediction of Auto Mechanics Final Grades 
from AGCT-1a and MA-1 Scores at Fort Knox, Kentucky. March 1942. 

114. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 291. Estimation of the Effect of Omitting Block 
Counting from the Army General Classification Test. March 1942. 

115. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 292. Analysis of Block Counting Items of the 
AGCT. March 1942. 

116. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 301. Item Analysis of Arithmetic Test, 
X-1. Selection of Wrong Alternatives. April 1942. 

117. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 307. Interpretations of AGCT Test Scores: 
Negro and White Selectees. April 1942. 

118. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 308. Summary of Status of OCT-1, X-1, 
Standardization. July 1942. 


119. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 309. Standardization of Classification Test 
R-[illegible]. April 1942. 

120. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 311. Driver Experience Inventory. August 1942. 

121. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 312. Night Vision of Colored and White Soldiers. 
April 1942. 

122. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 313. Analysis of Test Scores of Apprentice 
Mechanics Motor Training Section, Quartermaster Replacement Training 
Center, Camp Lee, Virginia. October 1942. 

123. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 314. Reaction Time and Accuracy Tests Used at 
Camp Holabird. April 1942. 

124. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 319. Procedure for Factor Analysis of Studies 
Involving Thirty-Five or Fewer Variables. May 1942. 


125. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 324. Grades in a Motor Mechanics Course as 
Related to Vocational Training, Civilian Occupation, and Test Scores on MA-2, 
MA-3, Enlisted Men and Officers, Fort Benning. May 1942. 

126. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 325. Report on Analysis of Fort Knox Repeat 
Driver Tests, March 1942; Improvement on Road Test vs. Fort Knox Driver 
Tests. May 1942. 

127. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 326. Report on Standardization of WCT-1, X-2. 
May 1942. 

128. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 328. Study of Some Factors in Radio Operator 
Selection, Scott Field, Illinois. May 1942. 

129. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 330. Grades in Maneuvers Course and Winch 
Mechanics Course at the Balloon Barrage Course, Camp Tyson, Tennessee, 
as Related to Each Other to Score on AGCT, Mechanical Aptitude, MA-1, and 
Clerical Aptitude, CA-1. June 1942. 

130. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 331. A Procedure for Estimating the Proportion 
of the Total Scores on a Test Contributed by Each of the Parts of the Test. 
June 1942. 


131. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 334. Reliability of Fort Belvoir Night Vision 
Tests, June 1942; Hopkins Night Vision Test (Day to Day Reliability). July 
1942. 

132. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 338. Success in Officer Candidate Courses 
Related to AGCT Scores and Other Variables. July 1942. 

133. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 339. Computation of Test Score Reliabilities. 
May 1942. 

134. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 340. Accident Record vs. Psychological Tests. 
July 1942. 

135. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 344. The Effect of the Use of No. 1 Pencils on 
the Accuracy of Scoring IBM Answer Sheets by Machine. July 1942. 

136. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 347. Reliability of Personnel Form P-1. August 
1942. 

137. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 350. Analysis of Visual Classification Test, VC-1, 
X-1 Data from Camp Croft. July 1942. 

138. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 354. Standardization of the Visual Classification 
Test, VC-1, X-2, August 1942; Supplement: A Standard Score Scale for the 
Visual Classification Test, VC-1, X-2. September 1942. 

139. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 356. Relation of Failure to Army General 
Classification Test, Fort Monmouth, New Jersey. 

140. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 358. Validation of Tests for Selection of Radio 
Operators, ROA-1, X-1; CLT-2, X-3; and Substitution Test. August 1942. 

141. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 360. Four-Place Table of pq/z for Three-Place 
Values of p or q. 

142. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 363. Tables for Use in Converting Scores on AG 
Tests to Those on Coast Artillery Entrance Examinations. August 1942. 

143. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 371. Analysis of Attempts on Each Type of AGCT 
Item by Grade V Men in Regular and Special Training. September 1942. 

144. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 375. The Computation of Expectancy Tables. 
June 1942. 

145. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 375a. Interpretation of Correlation Coefficients 
in Terms of Expected Performance in One of the Associated Variables. August 
1945. 

146. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 378. Reliability of Radiotelegraph Operators 
Aptitude Test, ROA-1, X-1. September 1942. 

147. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 379. Driver Experience Inventory #2 (Camp 
Pickett Validation Data). October 1942. 

148. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 380. Validity of Officer Candidate Test, OCT-1, 
X-1. October 1942. 

149. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 381. Personnel Form P-1 (Also called Shipley 
Personality Inventory and the Personnel Form R-2). April 1942. 

150. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 382. Analysis of Mental Alertness-1, X-2 Test 
Results for Female Students at Mount Vernon Seminary, Woodrow Wilson 
High School, Trinity College, and Catholic University. October 1942. 

151. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 385. Standardization of the General Proficiency 
Test, WCT-1, X-3. October 1942. 

152. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 386. Test Scores of Accident vs. Non-accident 
Drivers. October 1942. 

153. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 389. Evaluation of War Orientation Test-1, X-1. 
June [year illegible]. 

154. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 392. Standardization of WAAC Specialist Tests. 
November 1942. 

155. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 393. Analysis of Responses to Each Alternative 
of Each Item in the General Proficiency Test, WCT-1, X-3. November 1942. 

156. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 394. Driving Performance vs. Experience and 
Test Scores (Fort Knox Data). November 1942. 

157. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 401. Analysis of Responses to Each Alternative 
Made by Men Tested at Induction Stations: Visual Classification Test, VC-1, 
X-2. November 1942. 

158. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 402. Analysis of Responses to Each Alternative 
of Each Item for Grade V Men in Special Training Units and in Regular 
Training Units. November 1942. 

159. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 403. The Wherry-Doolittle Test Selection Method. 

160. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 406. Analysis of Responses to Each Alternative 
of Each Item. December 1942. 

161. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 410. Development of Improved Radio Code 
Aptitude Tests. March 1942. 

162. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 411. Selection of Men for Training as Medical 
Technicians. Evaluations of Procedures Used at Camp Lee and/or Camp 
Pickett, Virginia; Billings General Hospital, Fort Harrison, Indiana; Walter 
Reed Hospital, Washington, D. C. January 1943. 

163. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 412. Relation Between Original Tests (MA, CA, 
and AGCT) Given at Reception Centers and Retests Given at the Armored 
Force School, Fort Knox. December 1942. 

164. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 418. Selection of Officer Candidates: Validity 
of Officer Candidate Test, OCT-1, X-1, for Predicting Academic Success of 
West Point 1942 Class. January 1943. 

165. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 420. ACE Psychological Examination (1942 ed.) 
Raw Scores Equivalent to AGCT Standard Scores. August 1943. 

166. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 421. Methods for Estimating Test Efficiency. 
August 1943. 

167. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 422. Officer Candidate Test, OCT-1, X-1, Item 
Analysis Based on Samples of Fort Belvoir and Camp Lee Officer Candidates. 
March 1943. 

168. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 424. Officer Candidate Tests OCT-2, X-1, and 
X-2, Item Analysis Based on Camp Lee Officer Candidates and Compilation 
of OCT-1 and OCT-2. March 1943. 

169. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 425. Item Analysis of Automotive Information 
Test, TK-1, X-2, Fort Meade, Maryland. June 1942. 

170. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 427. Arithmetic Test, EA-3, X-2. Item Analysis 
Based on Sample of WAAC Auxiliaries. March 1943. 

171. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 428. Validity of Military Police Test Battery 
for Predicting Course Grades at Provost Marshal General's School, Fort 
Oglethorpe, Georgia, March 1943; Standardization of Reading and Reporting-1, 
X-1, for Military Police Officer Candidates. May 1943. 

172. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 430. Selection of Truck Drivers: Driver 
Experience Inventory #2, Driver Information Test #9, and other Measures, 
Camp Lee, Virginia. March 1943. 

173. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 432. General Classification Test, GCT-1c or 1d. 
Test-Retest Differences for Enlisted Men Who Score in Grade V on Original 
Test. April 1943. 

174. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 433. A Comparison of the AGO Experimental 
Battery, WPQ-1, X-1 and the West Point Qualifying Examinations for 
Prediction of First Term Academic Performance of Fourth Classmen Entering 
July 1942. April 1943. 

175. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 437. Selection of Officer Candidates, 
Standardization and Validation of OCT-1 and OCT-2 at Fort Benning and Fort 
Monmouth. August 1943. 

176. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 439. Norms for Driver Information Tests DIT-9 
and DIT-10. January 1943. 

177. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 444. Selection of Leaders, Status of the 
Measurement of Leadership. April 1943. 

178. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 446. Selection of Officer Candidates, Validation 
Study of Leadership Test L-1, X-1, at Fort Belvoir and Fort Benning. June 1943. 

179. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 447. Standardization of Clerical Aptitude Test, 
CA-2, X-2 for War Department Civilian Personnel. November 1942. 

180. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 449. Mechanical Aptitude Test MA-4, X-1: Item 
Analysis Based on Sample of WAAC Auxiliary. July 1943. 

181. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 450. The Validation of an Experimental Battery 
of Combat Intelligence Tests at Camp Blanding, Florida. June 1943. 

182. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 451. Comparison of Scoring Formulas “Rights 
and Rights Minus 1/3 of the Wrongs” Based on the Results of West Point 
Cadets on AGCT-1d, Elementary Math-1, X-1, and Language Aptitude-1, X-1. 
July 1943. 

183. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 452. Item Analysis: General Automotive 
Information Test, TK-7, X-1, Normoyle Ordnance Motor Depot, San Antonio, 
Texas. September 1943. 

184. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR 
DEPARTMENT. PRS Report No. 453. Comparison of Performance of Women 
(WAAC’s) with That of Men: Radiotelegraph Operator Aptitude Test, ROA-1, 
X-1. July 1943. 

5. PERSONNEL Researcu Section (Starr), ApyutTaANnt GeENERAL’s Orrice, WAR 
DeparTMENT. PRS Report No. 457. Clerical Aptitude Test CA-2, X-2: Item 
Analysis Based on War Department Civilian Employees. August 1943. 

. PERSONNEL Researcn Section (Starr), ApyJuTANT GENERAL’s Orrice, War 
DEPARTMENT. PRS Report No. 459. Validation of Tests of Selection of W AAC 
Trainees for Basic and Specialist Schools. August 1943. 

. PersonneL Researcn Section (Starr), ApJuTANT GENERAL’s OFFice, WAR 
DEPARTMENT. PRS Report No. 462. Procedure for Computation of the Pearson 
Product Moment Coefficient of Correlation Using Special Computation Chart. 
October 1943. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 465. Values of ΣX, ΣX², ΣXY, ΣY, and ΣY²,
when N varies from 1 to 20 for each Cell of a 13 x 13 Scatterplot. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 466. Development of Basic Classification Battery.
The Influence of General Information in the Reading and Vocabulary 7,
September 1943. 


. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 468. Selection of West Point Cadets. March 19


. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 469. Validation of Women’s Classification 7 
WCT-2, as a Predictor of Success in WAC Officer Candidate Schools, | 
Oglethorpe. March 1944. 


2. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 470. Standardization of Reception Center S
Training Unit Tests, Fort Ontario. November 1943. 


. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 474. Rating Procedures for Measuring
formance. Paired Comparison and Rank in 100. November 1943. 


4. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 475. Interim Report on AAF Ground Crew ( 
sification Test. August 1943. 


. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 476. Validation of Radiotelegraph Operat
Aptitude Test, ROA-1, X-1, and the Code Learning Test, CLT-2, X-3, |!
Knox. November 1943. 


. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 477. Statistical Summary on the Aptitud:
Studies at the Second Air Force Intelligence School, 18th Replacement J] 
Salt Lake City, Utah. December 1943. 


. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 483. Relationship of Classification Test R-]
WAC Classification Test, WCT-2, for a Recruiting Station Population. Ja
1944. 


. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 484. Validity of the Officer Candidate T« 
Predicting Academic Success at the Tank Destroyer and Transportation ( 
Officer Candidate Schools. January 1944. 


. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 485. Relationship of WAC Classification 1 
WCT-2 to Army General Classification Test for WAC Applicants. January 


. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 486. Technique for the Comparison of
Groups on Two Forms of a Test. January 1942. 


. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 488. Validation of AAFTTC and AGO Apt
Tests. October 1942. 


. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 493. Construction and Standardizatio:
Women’s Classification Test, WCT-2, to Replace WCT-1 for WAC Recru
September 1944.


. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 499. The Use of Age-Grade Placement and ( 
Success Data in Predicting Scores on the Soldier Performance Report. 
1942. 


. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 500. The Validity of Preference Inventory (I!
X-1) for Prediction of Leadership Ratings at the Infantry and Engineer O 
Candidate Schools. March 1944. 


. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 501. Report on the Development of Mac! 
Scores Code Aptitude Test, ROA-2, X-1. July 1943. 


. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 502. Article: Interpretation of a Test Va!
Coefficient in Terms of Increased Efficiency of a Selected Group of Perso 
by M. W. Richardson. April 1944. 

PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 504. Article: Estimation of the Change in Certain 
Statistical Constants Due to Selection on a Single Given Variable. April 1944. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 506. Analysis of a Rating Scale for the De- 
termination of Marginally Satisfactory and Unsatisfactory Soldiers, Fort Mc- 
Clellan. May 1943. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 510. Validation of Induction Station Tests I,
Fort Belvoir. March 1943. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 511. Validation of Induction Station Tests II,
A Preliminary Study at Camp Pickett Medical Replacement Training Center. 
March 1943. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 512. Validation of Induction Station Tests III, 
Fort McClellan. May 1943. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 514. The Selection of Inductees at Induction 
Stations—The Comparability of Qualification Test, Q-1, and Qualification Test, 
Q-2, First, Fourth, and Fifth Service Commands. October 1943. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 515. A Follow-Up of the Induction Station Test 
Validation Study at Fort McClellan IRTC. April 1944. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 516. The Validation of Induction Station Test 
V: The Relationship between the Scores on the Experimental Induction Station 
Tests and Success in Reception Center Special Training Unit at Fort Leaven- 
worth, Fort Benning, Camp Robinson, and Fort Sam Houston. April 1944. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 517. Standardization of Group Target Test,
GT-1, Individual Examination, IE-1, Group Orientation, GO-1, Individual Target,
IT-1, Visual Classification, VC-1a, and Non-Language Individual Examination,
NIE-1. April 1944. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 518. Validation of Induction Station Tests, Six
Supplements. May 1944.

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 519. Differential Patterns of Item Attempts on 
the Army General Classification Test Exhibited by Grade IV and V Men Tested 
at the Reception Center, Fort Leavenworth, Kansas, 1944. April 1944. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 521. The Validity of the Wechsler Mental 
Ability Scale as a Predictor of Soldier Performance Ratings of STU Trainees. 
April 1944. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 522. West Point Selection Examination for Pre- 
diction of First Term Academic Performance of 1943 Fourth Classmen. June 
1944. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 527. Standardization of the West Point Qualify- 
ing Examination, WPQ-1, for the 1944 Fourth Class. June 1944. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 528. AIT-Further Validation of the Shoulder
Patch Test Executed at the ERTC, Fort Belvoir, Virginia. June 1944. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 529. AIT Validation Study at the QMRTC, Camp
Lee, Virginia. June 1944. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 530. Report on Radio Code Aptitude Tests. De-
cember 1944. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 530. Report on Radio Code Aptitude Tests: Part
I, Validation; Part II, Standardization. May 1945. 


25. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 532. Tank Destroyer School, Camp H 
Experiment in Combat Adaptability. December 1943. 

PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 543. Current Status and Recommendations Re-
lating to Tests for Classification of Aircraft Warning Trainees. February 1944.

7. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 545. Performance and Written Tests and Perso, 
Data Factors as Predictors of Grades of Enlisted Air Crew Radio Mec’ 
at Scott Field. November 1944. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 546. Validation of Practical Performance a
Other Technical Tests at Keesler Field, Airplane Mechanics. October 1944 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 550. The Relative Validities of Performar
Aptitude and Written Tests for the Prediction of Success in Aircraft Armorer
School at Buckley and Lowry Field, Colorado. August 1944. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 551. Standardization of the Army Indi
Test (AIT-1) Camp Barkeley, Texas, May 1944. August 1944. 

PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 553. Development of the Weather Aptit
Test, TC-3a, for Predicting Academic Success at Weather Observer Sch 
August 1944. 

2. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 562. Re-evaluation of Expectancy Tabi 
Easily Understood Terms Which Are Comparable from One Test to An 
September 1944. 

PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 563. Analysis of Data on Mental, Mechar 
Clerical, Motor, and Visual Tests from Philadelphia Quartermaster De 
September 1944. 

PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 564. Estimating the Effect on Correlat
Inserting Mean Scores for “No Data” Cases. September 1944. 

PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 567. Selection of Alternatives for Arithmé
Reasoning Test, Experimental Forms 1, 2, 3, and 4, for the AGCT-3. October 
1944. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 568. Development of AGCT-3 and Informat
Tests. August 1945. 

7. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 590. Summary Report on Warrant Officer Ex-
aminations. October 1944. 

PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 596. Procedures for Applying the Adkins-T oo
Simplified Formulae for Item Selection. October 1944. 

PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 597. Validity of AGO Tests as Predictors of
Success in Rock Island Armament Maintenance School, and Rock Islar 
Arsenal Sub-Office at Dunwoody Institute. November 1944. 

PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 599. Validity of Learning Ability, OG-056a and
Clerical Aptitude CA-3, Part A in Certain Sections of the Casualty Branch, 
AGO. October 1944. 

PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 603. Standardization and Item Analysis of Nine
Verbal Tests. December 1944.

PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 610. Analysis of Procedure and Rejection 
All Induction Stations Operating During a Six Day Period in June 1944. 
3. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 611. Test Selection and Suppressor Variables. 
January 1945. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 612. The Effect of Guessing on the Biserial Cor- 
relation between Two Category Test Items and “True” Scores. March 1945. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 613. Checking the Adjustment of IBM Test 
Scoring Machines. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 617. Validation of Testing Battery Suitable for 
Use in the Selection of Under-Engineer Trainee of the Training Section, Signal 
Corps, War Department. March 1945. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 621. Standardization of the General Clerical 
Abilities Test, CC-105a: Part 1-New York Port of Embarkation. April 1944. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 622. Check on the Standardization of Army 
Radio Code Aptitude Test 1944 (ARC-1). August 1945. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 628. Study of the Difficulties of the Warrant 
Officer General Educational Test. November 1942. 

250. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 633. A Preliminary Determination of Item 
Difficulty and Validity for STU Placement and Achievement Tests in Reading 
and Arithmetic. June 1945. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 634. Selection of Content for Final Forms of 
Achievement and Placement Tests for Reading and Arithmetic Courses. 

252. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 636. Standardization of Shop Mechanics Tests, 
SM-1 and SM-2, and Automotive Information Tests, AIT-1 and AIT-2. June 1945. 

53. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 638. Standard Operational Procedure for Cor- 
recting rs between Two Variables for Restriction on Third. 

54. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 640. Construction of Experimental and Final 
Forms of Achievement and Placement Tests for Reading and Arithmetic 
Courses in Reception Center Training Units: Construction of Standard Score 
Scales. June 1945. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 641. Development and Validation of the Army 
Automotive Screening Tests for Use in Ordnance Schools. December 1944. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 641a. Follow-up Study of the Validity of the
Army Automotive Screening Tests for Use in Ordnance Schools. January 1946. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 644. Interpretation of Army Test Data for 
Civilian Educational and Occupational Guidance: Relation of Army General 
Classification Test to American Council on Education Psychological Examina- 
tion for College Freshmen (ACE) 1942 edition. August 1945. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 646. Completion of General Classification Test-1a
(Spanish Version). August 1945. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 647. Validity of General Clerical Abilities Test, 
CC-105a, and of Learning Ability Test, OG-056a, for Clerical Jobs at Head- 
quarters, Sixth Service Command, Chicago, Illinois. September 1945. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 648. Validation of Mechanical Knowledge Parts
I and II, Paper Form Board, and Learning Ability Tests. September 1945. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 652. Construction of Clerical Aptitude Test
CA-3, Part A—Speed; Part B—Fundamentals; and Part C—English for Use in
the Placement of Civilian Personnel of War Department Installations. Sep-
tember 1945.

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 655. Construction of General Clerical Abilities
Test, CC-105a, for Measuring Aptitudes of Civilian Clerical Workers. November
1945.

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 663. Report on Use of the General Mecha; 
Aptitudes Test, CM-142a, for the Selection of Trainees for the Rock | 
Armament Maintenance School Courses. September 1945. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 667. Construction of General Mechanical Apti-
tudes Test for Use with Technical and Mechanical Employees (Ci: 
October 1945. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 670. Major Study of Comparative Validity of
Five Periodic Officer Efficiency Reporting Methods: I. Zone of Interior. Decem-
ber 1945.

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 671. Comparative Validity of the WD AGO Form
67 and the FCL-2 According to Various Breakdowns: I. Zone of Interior. Decem-
ber 1945.

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 672. Major Study of Comparative Validity of
Five Periodic Efficiency Reporting Methods: II. European Theater. December 
1945. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 673. Comparative Validity of the WD AGO
Form 67 and the FCL-2 According to Various Breakdowns: II. European
Theater. December 1945. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 674. A Field Study of the Effectiveness o/ F(! 
3a, A Self-Training, Indorsed Efficiency Report. November 1945. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 675. The Relationship Between Main Ci:
Occupation and Other Variables. Part I—Preliminary Study Based on Machin:
Record Survey #2, November 1945. Part II—Relation Between Main (
Occupation and Army General Classification Test Standard Score, March 194 
Part III—Effect of Rater Training on WD AGO Form 67. January 1946. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 676. The Effect of Indorsement on the Vc
of Efficiency Report WD AGO Form 67. December 1945. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 677. Experimental Evidence of the Value of Rank-
ing as a Method of Rating. December 1945. 

3. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 678. Construction and Scoring of the Officer
Efficiency Report OER-A. October 1945. 

74. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 679. Construction and Scoring of the Officer
Efficiency Reports, FCL-2a, b, c. October 1945. 

75. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 681. Construction, Validation, and Standardiza-
tion of a Battery of Tests for the Army Finance School, Duke University, North
Carolina. May and June 1944. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 682. Development of Tests for Terminat
Accountants and Auditors. May and June 1944. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 683. Validity of AGCT-3a Total and Part Scores
in Predicting Success in Army Technical Training Courses. May 1946. 
278. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 684. Standardization of AGCT-3b Total and 
Part Scores. May 1946. 

279. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 685. Analysis of Military Knowledge Test, TC- 
101x. March 1946. 

280. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 686. Validation of Forms 3 and 4 of Electrical 
Information Test and Forms 3 and 4 of the Radio Information Test among 
Trainees at the Radio Repair Course, CSCS, Camp Crowder, Missouri. July 1946. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 687. Validity of Radio Information Test, Forms
1 and 2, in Predicting Success among Trainees in the Radio Repair Course
and in the Communications Course at the Tank Destroyer Training School,
Camp Hood, Texas. July 1946. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 688. Validation and Item Selection for the Elec- 
tricity and Radio Information Test at Truax Field, Wisconsin. July 1946. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 689. Administration of Electrical and Radio 
Information Test to Reception Center Populations. May 1946. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 690. Validation of Four Experimental Forms of 
the Electrical Information Test at the New York Trade School and the New 
York Television Institute. July 1946. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 691. Characteristics of Good and Poor Army
Nurses Compiled from Essays Written by Medical Officers, Supervisory Nurses, 
General Duty Nurses and Patients. May 1946. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 692. Development and Use of Army Trade 
Screening Tests in ASF, March 1946. July 1946. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 692a. Use of Army Trade Screening Tests to 
Evaluate the Effectiveness of Training in ASF Training Centers. Supplement I. 
March 1946. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 692b. Use of Army Trade Screening Tests to
Evaluate the Effectiveness of Training in ASF Training Centers. Supplement 
II. March 1946. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 693. Development of Instruments for Selection 
of Enlisted Personnel for Recruiting Work. July 1946. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 694. Performance and Written Tests and Per- 
sonal Data Factors as Predictors of Supervisory Ratings of Competence of 
Specialists in AAF Fighter Combat Units in Continental U. S. September 1943. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 694a. Supplement to Report on Performance and 
Written Tests and Personal Data Factors as Predictors of Supervisory Ratings 
of Competence of Specialists in AAF Fighter Combat Units in Continental 
United States. August 1946. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 695. Correlational Analysis of Sixteen Tests 
(Arlington Hall Factor Analysis Study). July 1945. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 697. Validation of General Clerical Abilities 
Test, CC-105a, and Certain Other Tests of Clerical Aptitude. February 1946. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 700. Item Analysis of the Multiphasic Personality
Inventory, Based on Data Collected at Camp Stewart under Project PR-4030.
June 1945.


295. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 701. Methodological Investigation of the Forced
Choice Technique, Utilizing the Officer Description and Officer Evaluat 
Blanks. July 1945. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 702. Obtaining Officer Preference and Officer 
Characteristics Scale Values of Adjective for Use in Construction of Items 
the Biographical Information Blank, PR-4061-02. July 1945. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 703. Construction and Selection of Items 
the Biographical Information Blank (BIB). July 1945. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 704. Validation of a Program for Selection 
Officers for Retention in the Peacetime Army. July 1945. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 705. Development of an Interview Procedure for
Use in the Officer Selection Procedures, PR-4061-09 and 4061-10. July 1945. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 706. Characteristics of Successful and Unsuc-
cessful Officers Studied for the Development of Officer Evaluation and Report 
ing Forms, PR-4061-08. August 1945. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 707. Analysis of Rating Made with the WD AGO
Form 67, Efficiency Report. July 1945. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 708. Analysis of Ratings of Air Force Off 
on AAF Form No. 123, Officer Evaluation Report. July 1945. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 711. Predictions of Leadership Qualif 
of Officer Candidates in the Signal Corps, PR-4071b. March 1946. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 711a. Prediction of Tactical Performance of Of
cer Candidates in Signal Corps, Supplement to Report and Recommendat 
PR-4071b. March 1946. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 712. Officer Retention Project Equivalent Scal: 
for the Two Forms of Officer Classification Test, OCT-14A and OCT-14B. 
June 1945, PR-4061. June 1945. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 713. Development of the General Survey Test,
Camp Blanding, PR-4061. May 1945. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 714. Validation of Officer Classification Test,
OCT-14, as a Predictor of Grades at the Command and General Staff School, 
Fort Leavenworth, Kansas. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 715. Possibility of Predicting Proper Classifica 
tion of Officer on Basis of Differential Scoring of FCL-2a Items, Part II (Most 
Least), PR-4073. April 1946. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 717. Comparison of Rating Check List (RCI 
and Forced Choice List (FCL) Methods of Obtaining Ratings, September 194), 
PR-4073. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 718. The Development and Evaluation of Classif 
cation Tests R-3 and R-4. June 1946. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 722. Data Concerning Possible Cut-Off Scores
on the General Survey Test for the 2nd Officer Integration Program, PR-4096 
July 1946. 

312. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 723. Development of Predictor Instruments Used
in Study of Selection of Candidates for Officer Training, PR-4076. August 1946. 
313. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 724. The Relationship of Army Individual Test 
Subscores and Other Mental Ability Tests to Diagnosis of Mental Disorder. 
June 1945. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 726. Analysis of Item Responses of WAC’s and
WAC Applicants on the Multiphasic Personality Inventory (TC-8a) and the 
Cornell Selectee Index Administered at Grand Central Palace, New York City. 
May and June 1944. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 727. Construction of Biographical Information 
Blanks NSB-1 and NSB-2 for Nurses and Women Medical Specialists. Septem- 
ber 1944. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 728. Construction of Army Nurse Evaluation
Report Form NSE-1B and Army Nurse Evaluation Report Supplement Form
NSE-1Bs. October 1946.

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 801. Validation of the General Clerical Abilities 
Test, CC-105a, as a Selection Instrument for the Position of File Clerk, CAF-2, 
Decorations and Awards Branch, AGO. March 1946. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 1000. The Determination of a Qualifying Score 
on Army Specialized Training Test, OCT-2, X-3 for Selection of Men for the 
Army Specialized Training Program. January 1943. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 1001. Selection of Officer Candidates: Validity 
of Officer Candidate Test OCT-2, X-3 for Predicting Academic Averages of the 
West Point 1943 Fourth Class. March 1943. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 1004. Standardization of United States Army
and Navy Test C-1 for Civilian Candidates for the Army Specialized Training 
Program. April 1943. 

21. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 1009. Prediction of Success in the First Term 
Basic Engineering Curriculum at Syracuse University. September 1943. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 1020a. AST Achievement Test Report: Decem- 
ber 1943, Standardization Testing. February 1944. 

. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 1025. Standardization of Army-Navy Qualifying
Test C-2 Administered. November 1943. 

24. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 1026. Prediction of Success in the ASTP Basic 
Engineering-1, Term 1 Curriculum at City College of New York. June 1944. 

25. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 1027. Relation of the Aptitude Test for the 
Medical Professions, Form 20, First Edition, to the Army General Classification 
Test. February 1944. 

26. PERSONNEL Researcnu Section (Starr), ApJUTANT GENERAL’s OrFice, War 
DepaRTMENT. PRS Report No. 1027a. Relation of the Aptitude Test for the 
Medical Professions, Form 20, First Edition, to the Army General Classifica- 
tion Test, for Candidates (a) Preferring Pre-Medical Training, (b) Preferring 
Pre-Dental Training and (c) Not Interested in Pre-Medical or Pre-Dental 
Training. 

. PersonNEL Researcu Section (Starr), ApyutaANt GeNeRAL’s Orrice, War 
DepaRTMENT. PRS Report No. 1028. Item Analysis of January 1944 ASTP 
Experimental Tests. February 1944. 

. Personne. Researcw Section (Starr), Apyutant Generat’s Orrice, War 
DEPARTMENT. PRS Report No. 1030. Results of the Administration of the Apti- 
tude Test (Form 20) for the Medical Professions to the ASTP Trainees. March 
1944. 


REVIEW OF EDUCATIONAL RESEARCH Vol. XVIII, No. 6





29. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 1031. Results of the Administration of the Apti-
tude Test (Form 21) for the Medical Professions to ASTP Trainees. March 1944.

30. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 1032. Some Preliminary Investigation into the
Relationships of the Aptitude Test for the Medical Professions to other [...]-
cational and Personal Variables. March 1944.

31. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 1034. Standardization of Army-Navy [...]
Qualifying Test C-3. April 1944.

32. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 1036. Comparison of AST Achievement Test
Results in the December 1943 Standardization with Results in the January 1944
Population. April 1944.

33. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 1037. The Effect of Scoring Formula Upon the
Reliability of AST Achievement Tests. April 1944.

34. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 1041. The Relation between Specialized Training [...]
Test.

35. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 1049. Results of Experimental Study of Effect
of Directions Against Guessing and of Corrections for Guessing on Scores [...]
ASTP Contract Tests, State University of Iowa.

36. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 1050. Comparison of Prediction of Success in
Terms I and II ASTP Basic Engineering Curriculum at Syracuse University.
June 1944.

37. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 1051. The Relationship Between Formal Item
Analysis and Reliability on AST Achievement Tests (October 1943 Res[...],
October 1943 Experimental, and December 1943 Standardization Tests). May
1944.

38. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 1052. The Validity of the AGCT, American Coun-
cil on Education Psychological Examination (1942 edition), Army Specialized
Training Test OCT-2, X-3, and the WPQ-1, X-1 Language Aptitude Test as
Predictors of Success in the ASTP Language Curricula at the College of the
City of New York, Syracuse University, Boston University, and Michigan State
College. June 1944.

39. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 1053. Survey of the Socio-Economic Level and
Post-War Educational Plans of Approximately 8000 Enlisted Men Assigned
to the ASTP Basic Engineering-1 Curriculum at 21 Training Centers. Septem-
ber 1944.

40. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 1066. Validation of the Mathematics Inventory
A510-2, Arl as a Predictor of Success in Term 1 of the Introductory and Basic
ASTR Curricula. March 1945.

41. PERSONNEL RESEARCH SECTION (STAFF), ADJUTANT GENERAL’S OFFICE, WAR
DEPARTMENT. PRS Report No. 1067. Standardization of ASTRP Qualifying 
Test C-4. April 1945. 
CHAPTER V


Research For or By the Armed Forces 


JOHN C. FLANAGAN and DOROTHY B. BERGER 


Selection Research on Mental Adjustment 


Research in the field of mental adjustment in the Armed Forces con-
sisted of (a) selection studies which used tests mainly of the paper-and-
pencil variety to appraise the emotional stability of servicemen and (b)
studies which attempted to make the same assessment on the basis of psy-
chiatric interviews, situation tests, and other screening procedures.

The use of the Rorschach in evaluating military personnel was discussed 
by Linn (134) in a study in which Rorschach records obtained from a 
group of enlisted men assigned to a hospital were compared with per- 
formance ratings a year later after eight months of overseas duty. Re- 
sponses given by well-adjusted soldiers were markedly different in many 
respects from norms based on well-adjusted civilians. The hypothesis was 
advanced that personality constriction and regression were produced by 
military indoctrination. 

A group of papers on the construction, standardization, application,
and results of research on the Cornell Indices and the Cornell Word Form
included a report by Mittelmann and Brodman (159) suggesting that the
Cornell Service Index, Selectee Index, and Word Form were designed to 
differentiate quantitatively individuals with personality and psychosomatic 
disorders and to facilitate qualitative diagnosis of these disorders. A report 
by Weider and Wechsler (261) discussed the results of the application of 
the Cornell Indices and Word Form, the criteria of significant answers, and 
validity data. Wolff (266) indicated that with their basis of clinical expe- 
rience and psychological and psychiatric principles the Cornell instru- 
ments might be used at induction stations, clinics, neuropsychiatric wards, 
and medical and surgical wards in hospitals, or in industry, veteran place- 
ment, research, and hospitals and clinics in civilian work. Harris (78) 
discussed the use of the Cornell Selectee Index as an aid and timesaver in 
the psychiatric diagnosis of naval personnel. 

The Personal Inventory was discussed by Shipley and Graham (199) 
who presented a report and complete bibliography of the work on that and 
other tests of emotional stability. Satter (188) reported a study in which 
the results of the Personal Inventory were compared with success and 
failure in parachute school. Satter (187) also discussed the inability of 
the Personal Inventory, the Otis Tests of Mental Ability, the Two-Hand 
Coordination, several other tests, and psychiatrists’ evaluations to predict 
officers’ ratings of enlisted men in the submarine service. Shipley, Gray, 
and Newbert (200) found that the Personal Inventory differentiated be- 
tween discharges from the Navy and men still active in the service after 
one year, between good and bad conduct cases, and between those rated
and those not rated. Berry, Leavitt, and Mote (22) compared Forms A
and B of the Inventory and found them to be parallel.

Wexler (215, Chapter 9) discussed the measures of personal adjustment 
developed and used in the Bureau of Naval Personnel and Watson (257)
the prognostic value of the psychological tests in the Navy Officer Training 
Program. 

Selection on the basis of neuropsychiatric screening was described by
Southworth (211) who presented data on rejections and factors influencing
rejections at the Great Lakes Naval Training Station. Newman, Bobbitt,
and Cameron (162) reported on a reliability study of an interview by two
psychologists and one psychiatrist for the evaluation of officers in the
Coast Guard. Biserial correlation coefficients were reported for failures
to graduate from Submarine School for four tests developed thru the use
of psychiatric criteria by Bartlett (16) along with evidence showing the
relationship between clinical evaluations and school failure.

The psychobiological screening procedures in the War Shipping Ad- 
ministration were discussed by Killinger and Zubin (102) who pointed 
out that the screen caught 85 percent of those who would eventually have
to be disenrolled. The effectiveness of battle-noise equipment as a test 
for emotional stability was evaluated by Hartley and Jones (79). 

The selection of workers for strategic services was described by the Office
of Strategic Services Assessment Staff (229). In addition to a vocabulary
test, a sentence completion test, a health questionnaire, a work conditions
questionnaire, and a personal history form, the process included several
outdoor tests such as the Brook Test, the Wall-Scaling Test, the Construction
Test, and several paper-and-pencil tests such as the Map Memory Test, the
Bennett Mechanical Comprehension Test, and the Manchuria Test of Propa-
ganda Skills. Murray and MacKinnon (160) pointed out that altho no
follow-up has been completed, only one of the 300 cases selected by the
OSS staff failed because of a neuropsychiatric condition.

Steinberg and Wittman (212) discussed a study of the sociological, per-
sonality, and adjustment characteristics of hospital patients who sup-
posedly broke under camp life, of veterans in a mental hospital, and of
well-adjusted soldiers. A study of the interests of Marine Corps women 
as measured by the Kuder Preference Record was reported by Hahn and 
Williams (77). Adams and Fowler (1) presented a report on the reliability 
of two forms of an activity preference blank used to select fire controlmen. 


Selection Research on Intelligence 


Research in the selection of personnel on the basis of intelligence in-
cluded a group of studies on the Wechsler-Bellevue Test. The value of five
of the subtests of the Bellevue verbal scale in differentiating among nor-
mal, dull-normal, borderline, and mentally deficient groups in the exami-
nations of naval recruits was discussed by Lewinski (113). Altus (7) con-
cluded, from a study of the validity of the Wechsler-Bellevue, that the
validity is somewhat higher for the total scale than for Form B of the
scale. Hunt (86) considered the use of a ten-minute individual test of
intelligence, the Wechsler-Bellevue, reading and language handicap tests,
the Rorschach, and various educational tests for selecting naval recruits.
Correlations between the original and revised Kent Emergency Scales, and
between the Kent and the Stanford-Binet, and the Wechsler-Bellevue were
discussed by Lewinski (114). Greenwood, Snider, and Senti (69) described
a study of the correlation between the Wechsler Mental Ability Scale, Form
B, and the Kent Emergency Test administered to 200 Army personnel. A
correlation of .74 ± .02 was found between the two tests and it was con-
cluded that the Kent is suitable for intelligence testing in situations not
permitting more extensive testing.
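
The review does not say what the margin of .02 attached to the correlation of .74 represents. Assuming it is the classical probable error of a correlation coefficient, 0.6745(1 - r^2)/sqrt(N), which was a common convention in reports of this period, the short sketch below checks that the figure is consistent with the 200 cases mentioned. The function name is illustrative only, and the interpretation is an assumption rather than something stated by Greenwood, Snider, and Senti.

```python
import math


def probable_error_of_r(r: float, n: int) -> float:
    """Classical probable error of a Pearson r: 0.6745 * (1 - r^2) / sqrt(n).

    Assumption: the margin quoted in the text is a probable error,
    not a standard error or a confidence half-width.
    """
    return 0.6745 * (1.0 - r * r) / math.sqrt(n)


# Values taken from the text: r = .74 on 200 Army personnel.
pe = probable_error_of_r(0.74, 200)
print(f"probable error = {pe:.4f}")
```

Under this assumption the computed value rounds to .02, which agrees with the reported margin.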

A study was presented by Lindsley (121) which indicated that students 
with an Otis Intelligence Test score of —1 or less would fail in the filter 
course at Camp Murphy, Florida. Colmen (33) described a five-minute 
group test which was found to be adequate and reliable for measuring 
intelligence without being influenced by illiteracy. 


Research in the Selection of Officers 


Jensen and Rotter (91) reported that of thirteen psychological tests 
investigated as screening instruments, the most efficient combination for 
predicting academic success was the Personnel Test (Wonderlic modifica-
tion of the Otis Higher Examination), the Arithmetic Computation (Stan-
ford Achievement Test, Advanced), and the Combined Paragraph- and
Word-Meaning sections of the Stanford Achievement Tests. 

A program consisting of an interview in which past history rating was 
obtained, a standardized life-like construction test which yielded ratings 
on seven basic traits related to combat leadership ability, a specially 
devised sensori-motor test, a rapid projection test, and the group use of the 
TAT, was discussed by Murray and Stein (161). 

An analysis of the records of two classes at Fort Sill Artillery School 
by Garrett and Ligon (67) revealed that combat efficiency was not very 
closely related to ratings for leadership obtained in OCS, that there was 
some indication that the best officers came from age range 22-28 and that 
above a certain desirable minimum, intelligence as measured by the GCT 
had little relevance to combat performance. 

A group of studies on the selection of officers for the Navy included re- 
ports by Cornehlsen (38, 39, 215) on the growth of the selection and 
classification program for officers and for reserve officers for billets; by 
Miller and Owens (215) on the Basic Tests for Officer Personnel; by Fred- 
eriksen and Peterson (65) on the development and validity of the Navy’s 
Officers’ Qualification Test; by Gulliksen (70) outlining the specifications 
for an Officers’ Selection Test; by Frederiksen (58) on the comparison 
of the Officers’ Qualification Test, Form 1, and the U. S. Navy Aptitude 
Test, Form E-2; by Gulliksen (72) on the preparation of Form 1 of the
Navy Officer Qualification Test showing that the revised test had a wide
range of difficulty and a satisfactory reliability; by Frederiksen (61) on
the preparation of norms for the Officer Qualification Test, Form 1; by
Gulliksen (71) on the preparation of norms for women on the Officer
Qualification Test, Form 1; by Peterson (172) who gave a statistical evalua-
tion of the Navy Officer Qualification Test, Forms 2 and 3; and by Frederik-
sen (60) who discussed the preparation of a spatial relations test, consisting
of multiple-choice items concerning the rotation of solid forms, for selecting
radar officers. Conrad and Lannholm (215) described the prediction of 
success in Primary Officer Training School and Maucker (215) described 
such prediction in Advanced Officer Training School. 


Research in the Selection of Enlisted Personnel 


The selection of enlisted men in the Navy was discussed in a group of 
studies. Odell (215) gave an account of the growth and development of 
the selection, qualification, and classification programs. Bond and Miller 
(215) described the development of the Basic Test Battery. The Staff of 
the Bureau of Naval Personnel (219) presented studies on item analyses, 
time limits, reliabilities, norm development, validity, intercorrelations, and 
factor analyses of the Basic Test Battery. The Staff of the Bureau (22! 
also discussed the validity of Form 1 of the Basic Test Battery for selection 
for two types of elementary training schools. Bloom and Brundage (215) 
described the prediction of success in elementary enlisted schools. Curtis 
(215) described prediction in the advanced schools. Satter (187) found 
that there was no relationship between the scores on the Otis Higher Ex-
amination, the Personal Inventory, and the Two-Hand Coordination Test
and officers’ ratings of submarine crewmen on the job. Graham, Mote, and
Berry (68) found that the same battery predicted “tank escape” perform-
ance failures considerably better than chance for submarine crewmen.
Miller (157) reported a study on the choice of a test battery for selection 
of LCVP coxswains thru the use of the Wherry-Doolittle test selection 
method. Miller (156) also described reliability studies of six apparatus 
tests used with the Navy Basic Battery for the selection of LCVP coxswains,
and in another study (154) reported that the Navy finally chose a hand 
dynamometer and a pegboard test from this group of mechanical tests 
on the basis of correlations with ratings on boat handling. 


Research in Selection of Enlisted Personnel 
for Particular Jobs 


Studies concerning the selection of fire controlmen and radar operators 
included studies of vision such as the one by Adams, Fowler, and Imus
(3) which discussed the relationship between visual acuity and acuity of
stereoscopic vision. Adams and others (4) found that the Ortho Rater
Tests were sufficiently reliable as testing devices in the selection of candi- 
dates for training as fire controlmen and range finder operators. The inter- 
relationships among seven tests of stereoscopic acuity and the relationship 
between two tests of visual acuity and two tests of Phorias were pointed 
out by Fowler, Imus, and Mote (56). The battery used in the selection of 
fire controlmen, range finders, and radar operators was discussed by Beier
and others (21). Imus (88) presented the directions, procedures, tests, 
and equipment used in the selection and classification of fire controlmen. 
The final report on the Selection and Training of Radar Operators was 
made by Lindsley (117). 

The Staff of the Bureau of Naval Personnel (222) described the con- 
struction of selection and achievement examinations and the conduct of 
technical personnel research designed to facilitate the selection and train- 
ing of personnel in the maintenance and repair of electronic equipment. 
The predictive efficiency of the Navy Basic Test Battery at gunner’s mates 
school was discussed by the staff of the Bureau of Naval Personnel (226). 

Methods of selecting naval gun and engineering crews were discussed 
in a complete summary and bibliography of the Gunnery Project of the 
Applied Psychology Panel by Viteles, Gorsuch, and Wickens (242). 
Rogers, Viteles, and Voss (180) presented similar material for the Ap- 
plied Psychology Panel Engineering crew project. 

McQuitty (139) described the personnel selection program at an Engi-
neering Replacement Training Center which was continuous, being co-
ordinated with the training program, and was based upon the AGCT score,
formal education, and first and second best civilian occupations. Selection
for specialists’ courses was on the basis of interest, success in a specialty,
related hobbies, educational background, and aptitudes. 

The work on the selection and training of night lookouts, including a 
discussion of validation of old night-vision tests, of the measurement of 
the performance of night lookouts at sea, and of an analysis of the night 
lookout’s job, was summarized by Wedell (259). In a study of hearing 
in searchlight and other personnel requiring exceptionally good hearing, 
Clarke (31) reported that audiometers were unsuited for mass testing of 
acuity of hearing and suggested that gramophone records with words of 
varying intensities be employed. The study reported that only 40 percent 
of the 100 soldiers trained as listeners or spotters appeared to have, for 
both ears, hearing superior to three decibels loss. The best combination 
of tests for selecting Army weather observer students was reported by 
Cleveland, Faubion, and Harrell (32) to be a mental alertness test, a 
physics achievement test, and a meteorological achievement test. 

Kurtz, Seashore, and Willits (107, 105) presented a discussion of the 
Code Receiving Tests developed by the Applied Psychology Panel. 

Reid (178) reported that in aptitude tests for drivers in the Third
Armored Division, the greatest number of failures occurred for glare blind-
ness, defective acuity, and depth perception. A yarn test for color vision,
a field of vision test, and tests of depth perception, glare, balance, stability,
reaction time, and visual acuity were given to 10,000 prospective heavy-
truck drivers in this study. The selection and training of cargo handling
teams for combat-laden vessels was discussed by Ruch (185).

Selection tests and causes for rejection were described by Thomas | 
in a report on the selection of parachutists. 

A report on the construction of various performance tests, group and 
individual, and of checklists for objective observation in the Teacher 
Training Department of the Armored Force School, Fort Knox, was pre-
sented by Siro (202).




Research on Classification and Aptitude Tests 


Research in classification in the services included studies such as those 
McCain and Schneidler (135) discussed concerning the classification and 
selection of enlisted personnel. Eurich and McCain (50) also described 
the initial classification in which each recruit was given the general classifi-
cation, reading, arithmetic reasoning, mechanical aptitude, clerical and
mechanical knowledge tests, an interview, and then a recommendation for 
two possible jobs. The specifications for the construction of a general 
classification test, a reading test and a test of arithmetical reasoning were 
outlined by Frederiksen (62). The selection of items and a comparison 
of the selected items with those previously used for the General Classifica- 
tion Test, Form 2, and for the tests of reading and arithmetic was described 
by Satter (189). Wrenn (270) included in his description of Navy per- 
sonnel procedures the nature of the classification interview and the train- 
ing of interviewers. The derivation of national norms for the Fleet Edition 
of the General Classification Test was described by Peterson (171) who 
concluded from the data collected that the GCT (x-l-s) served satisfac- 
torily as a self-administering test and constituted a parallel form of the 
GCT, Form 1. He (170) also reported on a factor analysis of the new
Navy Basic Classification Test Battery. A statistical evaluation of the Basic 
Classification Test Battery, Form 1, led Conrad (35) to the conclusion 
that the battery competently fulfils the essential requirements. 

Procedures and tests used to select men for assignment to fill the balance 
crews for newly commissioned destroyers on the Pacific Coast were dis- 
cussed by Levin (111). The work of the Classification Section of the 
Armored Force Replacement Training Center described by Wittman (264) 
included material on the psychological, clerical, and mechanical-aptitude 
testing; on occupational interviews, testing and classification; on assign- 
ment to different military duties; on the selection of officer training candi- 
dates; on the liaison relationships with regular training companies; on the 
record keeping and planning of activity flow; and on research and selection 
of men for the Special Training Unit which handles and studies physical, 
mental, and psychological problems. Malone (142) described the Army 
Classification system, the use of the AGCT, the Mechanical Aptitude Test, 
and the Radio Operators Aptitude Test. An activity preference test furnish-
ing a number of scores corresponding to clusters of functionally related
activities was reported by Kelley (100) who outlined the steps leading to 
its development. The selection of personnel with superior vision for the 
crew of the USS New Jersey was described by Verplanck (231) in a study 
using an NDRC Adaptometer Model II.

Research on aptitude tests for the Navy was reported by Frederiksen 
(63) who described a study made on an experimental battery of aptitude 
tests as predictors of service-school grades for inclusion in the Basic 
Classification Test Battery; by Conrad (36) who presented the research 
and developmental history of the Navy’s aptitude testing program; by 
Conrad (34) who also discussed the basic statistical facts concerning indi- 
vidual items of the Navy Aptitude Tests and interpreted these facts with 
reference to various problems; by Stuit and Feder (215) who described 
the development of special aptitude tests by the Bureau of Naval Personnel; 
by Gulliksen, Conrad, and Frederiksen (75) who confirmed an earlier 
conclusion, by studying the averages, standard deviations, and intercorre- 
lations of the Navy aptitude tests, that variations in procedure from one 
station to another constitute a serious problem; by Gulliksen (73) who 
compared the selection of test items for a mechanical comprehension test 
by an item analysis based on an external criterion and by the technic of 
item-total correlation and also (74) presented minor modifications which 
could be made in a short time to the Navy Mechanical Aptitude Test, 
Form T, and made suggestions for a more thoro revision of the test. 

Validation research was presented by Frederiksen (64, 63) in a dis- 
cussion of the validities of aptitude tests at various schools; by Crawford 
and Burnham (45) in a report on the results of the educational aptitude 
testing of V-12 students in which it was found that aptitude tests proved 
to be effective predictors of academic work measured by objective achieve- 
ment tests; and by Anderson and others (10) in a paper on the Oscilloscope 
Operator tests. Prediction of ability was discussed by Kurtz (104) in 
relation to code learning. Prediction of success in Electricians’ Mates 
School was presented by Conrad and Satter (37) in a discussion of the 
use of test scores and quality classification ratings. A report was given by 
Smith and Voss (206) on a study of the effectiveness of the classification 
procedures for officers of the Amphibious Training Command. Smith and 
others (207) also reported on the effectiveness of classification data in 
predicting billet performance in training in the Amphibious Force. Predic- 
tion of success in service school from the order of assignment was discussed 
by Satter and Conrad (190). A study presented by Wedell (260) reported 
on the prediction of the performance of night lookouts. 

Procedures used in a job analysis of the tasks performed at gun stations 
were enumerated by Viteles and Smith (243). The final report and sum- 
mary of work in job analysis qualification and placement of personnel in 
the Amphibious Force was presented by Smith (205). 

A selectometer for weighting the qualities on which interviewers rate 
men was discussed by Keislar (93) and by Campbell (30) who made a
final report on research and development of classification aids. Another
classification aid, the point-score method for evaluating Naval personnel,
was presented by Levin (112). Viteles (233) presented an interviewer's
recommendation chart which shows in visual form the billets aboard ship
for which an individual with given qualifications is most adapted. The
personal preference technic, which employs the opinions of co-workers or
students who have had adequate time to observe their fellows, was dis-
cussed by Wiggin and Bartlett (263) as a possible supplement to in-
structor’s grades.


Research on Training 


In an article on military training and learning theory, Wolfle (267) 
indicated that help was given to military specialists in World War II by 
psychologists who applied such principles of learning as distribution of 
practice, active participation, variation of material, accurate records of 
progress, knowledge of results, and systematic lesson plans. Applications 
of these theories appeared in some of the work discussed below. 

Gunnery-training research included such studies as the ones discussed 
by Viteles et al (235, 245, 246, 249, 251, 252, 253), outlining training 
aids, lesson plans, and courses of instruction for a four-day course in 
20 mm and 40 mm gunnery; by Viteles (234) who investigated the 
scoring characteristics of the Machine Gun Trainer, Mark 1; by Smith et al 
(208) presenting a memorandum on gunnery teaching; by Covner and 
Viteles (40) presenting instruction in engineering, damage control, and 
gunnery at the CVE Precommissioning School; and by Viteles, Gorsuch, 
and Wickens (241) describing the standardized four-day courses using 
unit lesson plans for a gunnery-training program, and the study of syn-
thetic training devices. Range-estimation studies included Wickens, Gor-
such, and Viteles’ (262) account of lesson plans for instruction on the
mirror Range Estimation Trainer Device 5C-4; Voss and Wickens’ (255) 
comparison of free and stadiametric estimation of opening range; Horo- 
witz and Kappauf’s (84) description of the accuracy of unaided visual 
range estimation for aerial targets at ranges between 1500 and 8000 
yards; Viteles et al’s (248) analysis of the results obtained in training 
men in range estimation on the firing line; and Rogers’ (181) evalu-
ation of methods of training in estimating a fixed opening range. Hoffman
and Mead (82) discussed the performance of Anti-Aircraft Artillery per- 
sonnel on a complex task of four-hours duration. Research in the training 
of engineers was discussed by Rogers, Viteles, and Voss (179, 180) ; and 
by Viteles, Gorsuch, and Watters (240) who discussed the improvement 
of a training program for newly organized crews for destroyers and for 
auxiliary ships. Masoner and Watters (149) presented an instructor's 
manual which served as a guide in the administration of special engineer- 
ing courses. A manual for training balance crew engineers for attack 
transport vessels was described by Covner, Gorsuch, and Viteles (41).
Viteles and Gorsuch (236) prepared a memorandum on effective teaching 
methods for engineering instructors. A group of studies concerned with 
progressive engineering were reported by Viteles and Gorsuch (237, 238) 
who presented lesson plan outlines for Stages I and II of the instruction; 
by Organist et al (167), who prepared an instructor’s manual for Stage II 
and who presented outlines for instructions in Stage III (165); and by
Organist and Willis (166) describing the organization and instruction
for Stage III. Covner et al (44) prepared an instructor’s manual for pre- 
senting information on the distilling plant to engineering personnel. 

The training of fire controlmen and range finder operators was pre- 
sented by Beier et al (20) in the form of a series of lesson plans. The in- 
fluence of visual tasks in the training course of fire controlmen upon their 
visual proficiency was discussed by Adams, Beier, and Imus (2). Covner, 
Gorsuch, and Viteles (42) presented a manual for instructors with a 
detailed step-by-step procedure for operation of fireroom equipment on 
destroyer escort vessels. 

The training of radar operators was reported on by Lindsley et al (132), 
who discussed the use of the Philco trainer for A-scan oscilloscope oper- 
ators; and by Lindsley et al, in a series of articles (120, 128, 133), 
describing and presenting recommendations and generalizations for the 
use of the PPI flash-reading and tracking trainers in training Navy search- 
radar operators. Lindsley (119) also gave an account of the results of a 
study determining the effectiveness of the course of training SCR-270-71 
radar operators. The effectiveness of the Foxboro Trainer in training 
oscilloscope operators to track by means of pip-matching was evaluated by 
Lindsley and others (129), who also made recommendations for its use 
(127). A study of the SCR-584 basic trainer as a device for teaching range 
tracking was presented by Lindsley et al (130) and by Anderson et al (9). 
The Lufts Tracking Trainer was described by Hudson and Searle (85). 
The results of developmental work done on the design and construction 
of a director tracking trainer and experiments to determine the effects of 
various fatiguing circumstances on performance were summarized by 
Mead (151). Kappauf (92) reported on phototube scoring devices for 
tracking trainers. An experimental investigation involving a comparison 
between tracking to a fixed hairline and tracking to a rotating hairline was 
presented by Lindsley et al (124). Experiments in training radar operators 
in visual code reception were discussed by Anderson et al (8) and by 
Lindsley et al (125). The use of radar scope movies for briefing and 
reconnaissance purposes was evaluated by Lindsley (122). Lindsley and
others (131) reported a study of the reactions of radar operators under
speed stress. Lindsley also described the
factors determining the accuracy of reading oscilloscope code in a study
(126) designed to find the speed, width, amplitude, dot-to-dash ratio, and 
letter-code cycles at which code can most accurately be read. 

An extensive group of studies concerning the training of telephone talkers
was discussed by Black and Mason (24); by Snidecor, Mallory, and
Hearsey (210) in relation to the use of mass drill, continuous prompting,
instruction by skilled men, dramatic recording, criticism, and discussion
as methods of training telephone talkers for increased intelligibility; by
Hibbitt and Mallory (81) in relation to an experimental investigation of
a course for telephone talkers; by Curtis (46) in relation to increasing the
intelligibility of voice communication by training in voice technic and
(47) in relation to the use of noise in a training program; by Abrams
and others discussing the factors determining the intelligibility of speech
in noise; and by Mason (146) concerning the effects of training on
articulation. Mason and others (147) reported on the indoctrination of
aircrewmen in voice communication at altitude and (148) on the training
studies in voice communication. Studies of the effect of pitch on the
intelligibility of voice communication were discussed by Mason (145).
The relationship between loudness and intelligibility of airplane
interphone communication was pointed out by both Curtis (48) and Talley
et al (218). Reports on the analyses of mistakes made in word intelligi- 
bility tests over the T-17 microphone (144) and on the phonetic char- 
acteristics of words as related to their intelligibility in aircraft type noise 
(143) were made by Mason. Intelligibility in relation to various methods 
of holding the T-17 microphone for communication in noise was dis- 
cussed in a report of the Psychological Corporation (177). Talley, Curtis, 
and Haagen (217) reported on a related study on microphone position in 
voice communication. Snidecor (209) dealt with a preliminary study of the 
ability of rated men to judge speaking performance. Anonymous articles 
gave accounts of a study in training Classification Petty Officers to select 
telephone talkers (14) and of a speech interview for the selection of
telephone talkers (13). The final summary report of the work on the selection
and training of telephone talkers was made by Mallory and Temple (14?).
An account of the technics and procedures used by the Voice Communi- 
cation Laboratory was presented by Haagen (76). The final summary of 
work on voice communication was given by Black (23). 

Reports on training studies in radio code work include a summary of 
research in Radio Code Project N-107 by Kurtz and Seashore (106) and 
in Project SC-88 by Keller (95); a comparison of training methods at
two levels of code learning by Keller, Estes, and Murphy (98); reports
by Keller and Estes (96, 97) on the effectiveness of different types of 
practice in code learning and by Keller (94) on the code voice method of 
teaching; a comparative study of three methods of teaching code in the 
early weeks of the course by Seashore and others (195) ; and a discussion 
of the standardization of code speed by Kurtz, Seashore, Stuntz, and 
Willits (108). The development of a graduation and rating test for 
Class A radio schools was discussed by the staff of the Psychological Cor- 
poration (176). A group of four studies concerning methods to be used in 
code classes included Seashore and others’ (197) discussion of variation 
of activities to prevent monotony in code classes; their (196) report on
the effect of introducing sending code early in the course upon learning
to receive; Seashore and Stuntz’ (194) manual of activities for reducing 
monotony in code schools; and their (193) experimental study of the train- 
ing of radio operators to copy code thru interference. Seashore and Kurtz 
(192) presented an analysis of the errors made in copying code. 

A miscellaneous group of training studies included a report by Ruch 
and others (186) outlining training procedures and lectures on winch 
operations and presenting a rating form for grading trainees on electric 
winch operation; a critical evaluation by Shuttleworth (201) of the Army 
Specialized Training Program with reference to selection standards and 
the method of “block training”; a discussion by Layman and Boguslavsky 
(110) of the relationship between ability and achievement in the Army 
Specialized Training Program which pointed out that “neither secondary 
schools nor colleges were sufficiently challenging to induce maximum 
relationship between ability and academic achievement in many individual 
instances”; a presentation by Carstater (215) of the Bureau of Naval 
Personnel program for officer training and one by Batchelder (215) for 
enlisted personnel. Feder (53) reported standardization of instruction in 
several Navy schools concerned with elementary electronics training thru 
the construction of an achievement test and the standardizing of procedures 
on the basis of test results. 


Training Devices 


Discussions of studies concerning training devices included Exton’s (52), 
Noel’s (163), and Stott’s (213) accounts of the use of audio-visual aids 
in expediting the Navy training program. Wattles (258) presented the 
results of the teaching of gunnery with aids such as flash cards, films, 
rating sheets, lesson plans, and observation record forms for evaluating 
the instructor. Ullman (227) described the procedures of several night- 
vision training devices. Lanier (109) explained a night lookout trainer 
for use aboard ship. Dresser (49) examined the use of slide films in the 
Navy training program. Witty and Goldberg (265) discussed the use of 
flash cards, training films, film strips, picture portfolios, bulletin boards, 
posters, cartoons, maps, diagrams, charts, and other visual aids in special 
training units in the Army. An anonymous article (90) described the 
shortcuts in learning skills, ways to speed training, study books, manuals 
and lessons, and other aids to military training. Thomas (225) reported 
on the use of animated cartoons in training and indoctrination in the Army. 
A discussion of ship models in classroom instruction and other training 
aids was presented by Viteles and Gorsuch (239). 

Viteles and others (250) presented a discussion of the psychological 
principles involved in the design and operation of synthetic trainers with 
particular reference to anti-aircraft gunnery. Viteles and others (247) 
also described an investigation of the Range Estimation Trainer Device 
5C-4 as a method of teaching range estimation. The use, characteristics, 
advantages, and disadvantages of all the synthetic trainers used by the
Applied Psychology Panel projects were compared with training on real
equipment by Wolfle (268).


Morale Research 


Discussions concerned with the factors affecting morale included a
report by Madigan (140) which emphasized the difficulties in army
adjustment and the ways in which morale problems might be countered; one
by Prattis (173) concerning the morale of the Negro in the Armed Services
under the treatment received; and one by Evans (51) and one by O'Gara
(164) pointing out some factors affecting military morale. Blain (25)
discussed the war neuroses of merchant seamen and the personal and
morale factors involved in their etiology and prevention. Homans (83)
reported on the problems in morale and leadership on small warships.
Woods (269) discussed the morale factors of naval noncombatants;
Baganz, Mearin, and Woods (15) presented an account of the mental 
mechanisms and morale factors of Naval recruits in training. An anony- 
mous writer (89) summarized the points mentioned by soldiers as the 
features of army life most closely related to morale. A consideration of the 
development of rumors in the service and the ways of checking them was 
presented by Kelly and Rossman (101). 

Discussions of the factors which build morale included a presentation 
by Bassan (17) of factors found valuable in maintaining morale on a 
small combat ship; an account by Smith (204) of the personnel policy 
of the Navy and its relation to morale; a report concerning the problems 
of procurement, training, and morale among members of the Women’s 
Reserve of the U. S. Coast Guard by Stratton and Springer (214); a
description by Rose (182) of the bases and weaknesses of American
military morale in World War II; considerations by Schroeder (191) and
by Kreinheder (103) of the orientation program in the Army and the
qualities of good orientation officers; an anonymous article (153) con- 
cerning planned orientation for combat, orientation objectives, and the 
execution of the orientation course; a presentation by Brosin (28) of 
a program for utilizing the marginally unfit in the Armed forces and 
an analysis of the basic principles involved in morale improvement; a 
description of an analysis made of the morale of American occupation 
troops before and after the end of World War II and means of improving 
military morale by Warner (256); and a report by Rottersman (183),
based on the analysis of 20,000 selectee questionnaires regarding com- 
plaints, on morale as a factor in complaint reduction. 

Civilian research in morale included Allport and Schmeidler’s history 
(6) of a clearing house to aid psychologists in problems of morale. Shils 
(198) discussed the effect of governmental investigation on attitudes and 
morale. Appel and Hilger (12) presented a morale and preventive-psy- 
chiatry program in the Army. Osborn (168) summarized the services of 
the Morale Branch of the War Department as they affected the recreation, 
welfare, and morale of the American soldier. 


Leadership Studies 


A suggestion was made by Miller (158) that leadership could be taught 
by actual training under officers who are themselves good leaders and 
by experiencing leadership and its problems. A discussion by Metsker (152) 
included material on the mental characteristics of military leadership from 
the standpoints of selecting and training leaders. Mayberry (150) de- 
scribed an interview rating scale and technics employed in evaluating 
leadership qualities of officer candidates. McNassor (138) and Bavelas (18) 
discussed the training of leaders, and MacKechnie (136) reported on the 
development of leadership in small unit commanders. An outline syllabus, 
used as an aid in the Academy’s first course on the psychology of military 
leadership, was presented by the U. S. Military Academy at West Point 
(228). Intangible factors in combat, including teamwork and leadership,
were considered by McLain (137). Garrett and Ligon (66) in a report 
on combat leadership concluded that unless leadership is defined in some 
way which permits direct measurement of specific qualities, research on 
predictive items is likely to be useless. Ligon (116) discussed the problems 
of choosing efficient officer candidates, reports from combat areas, and
interviews with ex-combat officers concerning the characteristics of good 
combat leadership. A study of traits most frequently mentioned for a 
good officer and for distinguishing a good officer from a good enlisted man 
was reported by Heath and Gregory (80). Ageton (5) presented a dis- 
cussion and bibliography on military leadership and training methods. 
The development of a manual for instructors of leadership courses in Offi- 
cers’ Training School was presented by the Staff of the Bureau of Naval 
Personnel (29). The OSS Staff (229) gave an account of the measurement 
of leadership in life situation tests such as the Mined Road, Getting Past 
the Sentry, The Blown Bridge, and Killing the Mayor, where candidates 
were assigned leadership and expected to lead a group of men. 


Proficiency and Achievement 


Problems in the measurement of achievement in Naval Training Pro- 
grams, the types of tests developed and the outcomes of the Achievement 
Examination Program were described by the Staff of the Bureau of Naval 
Personnel (220). A group of reports on achievement included Ryan’s 
(215) and Feder’s (54) discussions of the services provided to Navy 
Training thru achievement examinations; Porter and Harsh’s (215) 
presentation of achievement examinations for elementary enlisted schools; 
Feder and Lawrence’s (215) account of the measurement of achievement 
in the Radio Technician Training Program; and Cruikshank and Darling’s 
(215) description of the Advancement in Rating Examination developed 
by the Bureau of Naval Personnel. Anderson and others (11) discussed
vision as related to proficiency in oscilloscope operation. Lindsley (123)
wrote on the same topic and gave recommendations concerning minimum
visual standards for radar operators. Ruch (184) evaluated a subjective
and an objective technic for rating winch operating ability. Prentice (174)
reported a study of the performance of night lookouts aboard ship. Keller
and Jerome (99) outlined a system for describing progress in receiving
International Morse Code. The construction and validation of a work
readiness test for distilling plant operators, which served as an objective
technic for evaluating proficiency, was described by Covner, Voss, and Wesley
and by Voss and Wesley (254).


Criterion Measures 


Discussions of research on criterion measures included Bechtoldt’s (215)
and Patterson’s (169) articles on the problems of the criterion in
prediction; Sisson’s (203) description of the criterion in Army personnel
research and the results of an exploration of the “nomination” technic as a
possible criterion of soldiers’ performance, which showed a correlation of
the order of .50 between scores on a selected test battery for enlisted men
and high and low “nominations” by fellows for competence; and Vaughn’s
(230) discussion of this same technic, which gave evidence of value as a
criterion in exploratory studies with Navy pilots.
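
The kind of validity figure reported for the “nomination” technic can be illustrated with a short sketch. The review does not say which correlation coefficient was computed, so the sketch assumes a product-moment correlation between battery scores and a high/low nomination flag (a point-biserial coefficient). The function names and all data values are hypothetical and are not taken from the studies cited.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


def nomination_validity(battery_scores, nominated_high):
    """Correlate test scores with a high (True) / low (False) nomination.

    Coding the nomination as 1 or 0 and applying the product-moment formula
    gives the point-biserial coefficient (an assumption of this sketch).
    """
    flags = [1.0 if high else 0.0 for high in nominated_high]
    return pearson_r(battery_scores, flags)


# Hypothetical scores and nominations, invented for illustration only.
scores = [52, 61, 47, 70, 58, 66]
nominations = [False, True, False, True, False, True]
validity = nomination_validity(scores, nominations)
```

A coefficient near zero would mean that the battery does not separate men nominated high from men nominated low; the figure quoted above, of the order of .50, indicates a moderate relationship.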


Methods of obtaining criteria of shipboard competence appeared in a
discussion by Bechtoldt, Maucker, and Stuit (19). Franzen presented (5?)
a method for selecting the best combination of dichotomous arrangements
to distinguish a categorical criterion. The effect on prediction of success
of an increasingly well-defined criterion was described in an article by
Stuit and Wilson (216). Miller (155) presented a discussion of the
selection and reliability of a criterion of proficiency in operating the LCVP.


CHAPTER VI 


Wartime Research in Psycho-Acoustics 


MARK R. ROSENZWEIG and GERALDINE STONE 


THE PRESSING demand for effective voice communication during the war 
stimulated widespread research in psycho-acoustics—the application of 
psychological methods to problems of acoustics, speech, and hearing.
Existing communication equipment and technics had to be tested; new
equipment and technics had to be designed. Typical wartime noises had to be
measured and studied, their effects evaluated, and some of them combatted. 
Human factors in communication had to be determined, utilized, and al- 
lowed for. Some of the studies directed toward these problems are mentioned 
in this chapter. 

Summaries of the extensive research on psycho-acoustics of the Applied 
Psychology Panel are contained in the book, Human Factors in Military
Efficiency: Training and Equipment, by Wolfle and others (135) and in
two reports by Black (18), and by Mallory and Temple (87). The work 
of the Psycho-Acoustic and Electro-Acoustic Laboratories, Harvard Uni- 
versity has been summarized by Miller, Wiener, and Stevens (97). This 
book includes references to relevant work performed at other laboratories 
and considerable background information. 


Voice Communication 

Basic to the investigation of speech material, communication personnel, 
and communication equipment was the method of “articulation tests” (30, 
31, 59, 60, 61, 111, 126). This was a method of testing communication 
systems by determining how well they serve to transmit speech. Carefully 
chosen speech items were employed; the proportion of items correctly re- 
ceived provided an indication of the relative effectiveness of the system. 
For any devices under consideration, as, for example, microphones, ar- 
ticulation tests could be used to indicate the relative effectiveness of differ- 
ent possibilities: carbon microphones, dynamic microphones, and magnetic 
microphones. Alternatives to the formal articulation test were abbreviated 
testing methods (31), subjective appraisal of intelligibility (6, 31), and 
threshold methods for evaluating intelligibility (31). 
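
The scoring principle of the articulation test, the proportion of items correctly received, can be sketched as follows. This is a minimal illustration and not the laboratories' actual procedure: the word list, the system labels, and the listener responses are all hypothetical, and exact matching after trimming and lower-casing is an assumed scoring rule.

```python
def articulation_score(sent, received):
    """Return the proportion of speech items that were received correctly."""
    if len(sent) != len(received):
        raise ValueError("sent and received lists must have the same length")
    if not sent:
        raise ValueError("at least one test item is required")
    correct = sum(
        1
        for spoken, heard in zip(sent, received)
        if spoken.strip().lower() == heard.strip().lower()
    )
    return correct / len(sent)


def rank_systems(sent, responses_by_system):
    """Order the systems from highest to lowest articulation score."""
    scores = {
        name: articulation_score(sent, responses)
        for name, responses in responses_by_system.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical word list and listener responses, invented for illustration.
WORDS = ["fire", "range", "port", "deck"]
RESPONSES = {
    "system_a": ["fire", "rain", "port", "desk"],
    "system_b": ["fire", "range", "port", "desk"],
    "system_c": ["fire", "range", "port", "deck"],
}

ranking = rank_systems(WORDS, RESPONSES)
```

Because every system is scored on the same items, the scores indicate relative rather than absolute effectiveness, which is how the method is described above.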


Speech Material 


The type of speech material used was found to be an important factor 
in intelligibility. Analyses were made of the phonetic characteristics of 
words as related to their intelligibility (1, 3, 6, 18, 89, 90). Recordings 
of messages made in combat situations were analyzed to provide
information about common errors and failures of communication (7, 18).
Intelligibility involves not only the physical characteristics of speech material
(acoustic spectra) but also such characteristics as the average number of
sounds per word, the relation of a word to other words in the language,
and apperceptive variables (97). On the basis of such considerations,
highly intelligible vocabularies, phonetic alphabets, standard forms of
command, and lists of call signals and telephone directory names were
constructed and tested. Various procedures for the pronunciation of numerals
were also tested (1, 2, 3, 5).


Distortion 


Distortion and interference are also factors in intelligibility. Experiments 
on amplitude distortion were performed with nonlinear circuits which 
either clipped the peaks or the center of the signal, or rectified the signal. 
The effects of each type of distortion on intelligibility were determined by 
articulation tests, and various measures of distortion were compared for 
their relation to the impairment of intelligibility (78). The effects of adding 
noise, both before and after distortion, were studied. Peak clipping (58, 
73, 75, 76, 78, 79, 81, 85) was found to improve the intelligibility of a 
signal if measurement was in terms of peak voltages. Such peak clipping 
may be used to advantage in hearing aids to protect the ear at high in- 
tensity levels, in AM radio transmission to allow continuous 100 percent 
modulation, and in radio telephony to improve intelligibility when static is 
present. Center clipping and rectification, on the other hand, were found 
to be detrimental to intelligibility. 
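
The three kinds of amplitude distortion named above can be sketched numerically. The sketch makes several assumptions: it uses a sine wave in place of speech, a clipping level of one quarter of the peak, and half-wave rectification (the text does not say whether rectification was half-wave or full-wave). It also shows one reason peak clipping can help when measurement is in terms of peak voltage: a clipped wave restored to the original peak carries more power than the unclipped wave.

```python
import math


def peak_clip(samples, limit):
    """Flatten every sample whose magnitude exceeds `limit`."""
    return [max(-limit, min(limit, x)) for x in samples]


def center_clip(samples, limit):
    """Remove the part of the wave lying within +/- `limit` of zero."""
    clipped = []
    for x in samples:
        if x > limit:
            clipped.append(x - limit)
        elif x < -limit:
            clipped.append(x + limit)
        else:
            clipped.append(0.0)
    return clipped


def half_wave_rectify(samples):
    """Keep the positive half of the wave (assumed form of rectification)."""
    return [max(0.0, x) for x in samples]


def renormalize_to_peak(samples, peak=1.0):
    """Rescale so that the largest magnitude equals `peak`."""
    largest = max(abs(x) for x in samples)
    return [x * peak / largest for x in samples]


def rms(samples):
    """Root-mean-square value of the samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


# One cycle of a unit-peak sine wave stands in for the speech signal.
N = 1000
tone = [math.sin(2 * math.pi * k / N) for k in range(N)]

# Clip at a quarter of the peak (hypothetical level), then restore the peak.
clipped_equal_peak = renormalize_to_peak(peak_clip(tone, 0.25))
```

With both waves held to the same peak, `rms(clipped_equal_peak)` exceeds `rms(tone)`. Center clipping, by contrast, discards the low-amplitude portions of the wave, and these were found to matter for intelligibility.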

The effects of frequency distortion on intelligibility were investigated 
with the use of low-frequency cut-off (43, 125), gradual “tilted” cut-offs
(58), and band-pass filters (39, 40). Various levels and spectra of masking 
noise were used. Results showed that an adequate speech-to-noise ratio 
should be provided over as wide a frequency range as possible. For ideal 
speech transmission, the frequency range should extend from about 200 to 
7000 cps, and the signal-to-noise ratio at each frequency should be 25 db 
or more. Combined frequency and amplitude distortion was studied with 
speech material that was both “tilted” (i.e., put thru a system with an 
oblique response characteristic having a regular gain per octave) and peak 
clipped (58). 
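
The figures given above for ideal speech transmission (200 to 7000 cps, with 25 db or more of signal-to-noise ratio at each frequency) can be turned into a simple check, sketched below. The band edges, the measured ratios, and the function names are hypothetical. The second function shows what a “tilted” response means, a gain that changes by a fixed number of decibels per octave; the 6 db slope and 1000 cps reference used in the test are illustrative values only.

```python
import math

IDEAL_LOW_CPS = 200      # lower edge of the ideal range, from the text
IDEAL_HIGH_CPS = 7000    # upper edge of the ideal range, from the text
IDEAL_MIN_SNR_DB = 25    # minimum signal-to-noise ratio, from the text


def evaluate_channel(band_snr_db):
    """List the ways a channel falls short of the ideal figures.

    `band_snr_db` maps (low_cps, high_cps) band edges to the measured
    signal-to-noise ratio in that band, in decibels.
    """
    problems = []
    covered_up_to = IDEAL_LOW_CPS
    for low, high in sorted(band_snr_db):
        if high <= IDEAL_LOW_CPS or low >= IDEAL_HIGH_CPS:
            continue  # band lies wholly outside the range of interest
        if low > covered_up_to:
            problems.append(f"no transmission between {covered_up_to} and {low} cps")
        snr = band_snr_db[(low, high)]
        if snr < IDEAL_MIN_SNR_DB:
            problems.append(f"{low}-{high} cps: only {snr} db")
        covered_up_to = max(covered_up_to, high)
    if covered_up_to < IDEAL_HIGH_CPS:
        problems.append(
            f"no transmission between {covered_up_to} and {IDEAL_HIGH_CPS} cps"
        )
    return problems


def tilt_gain_db(freq_cps, slope_db_per_octave, ref_cps):
    """Gain of a tilted response: a fixed number of decibels per octave."""
    return slope_db_per_octave * math.log2(freq_cps / ref_cps)


# Hypothetical measurements, invented for illustration only.
wide_but_noisy = {(200, 1000): 30, (1000, 3000): 27, (3000, 7000): 18}
narrow_band = {(300, 3000): 30}
```

Here `evaluate_channel(wide_but_noisy)` reports the single band that falls below 25 db, while `evaluate_channel(narrow_band)` reports the two missing portions of the frequency range.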

The quality of speech was found to change at high altitudes (71, 80, 
113). Attempts to improve the intelligibility of speech at high altitude thru 
deliberate frequency distortion gave little success (96). Modifications of 
equipment resulted in improved performance (83). 


Interference 


Interference was a major problem of military communication, and an 
extensive range of signals and noises was investigated to determine what 
factors contribute to their effectiveness for masking (39, 48, 62, 93, 95, 
112, 123, 125, 128). The important dimensions of interference are its
intensity, frequency, spectrum, temporal continuity, and annoyance value.
Noise was found to mask best when it is uninterrupted and when it has
a broad spectrum with a signal-to-noise ratio that is constant at all
frequencies. Greater annoyance is caused by interrupted and high-frequency
interference. Pure tones do not mask speech effectively, but continuous
tones of low fundamental and rich in harmonics mask almost as well as
noise, and they are more annoying. Speech can be used to mask other
speech, but its spectrum and not its meaning is the chief factor in masking.

The effect of interference was investigated also in the case of radio-range
signals (50) and radar signals (53).


Selecting and Training Personnel 


Since an effective communication system requires good talkers and
listeners, research was conducted in selection and training of personnel.
Studies were made on the rating of talkers (6, 15, 100, 109, 110, 120, 121);
factors found to be closely related to intelligibility were loudness (21, 22)
and intensity control (100), and precise articulation (92). Factors showing
a slight relation to intelligibility were pitch (91), voice spectrum (6), rate
of speaking (18), telephone experience, education, listening ability, and
memory span (100). General American dialect was found to be slightly
more intelligible than Southern or Eastern (100).

It was easier to test listeners than talkers, because standardized
phonographic tests could be given to many listeners at once (24, 69, 107).
Articulation tests given under relatively quiet conditions were found not to show
who will listen well in noise (110). Noise generators were therefore
designed for testing and training programs (11, 12, 48, 101, 127).
Experiments suggested the existence of an ability to listen in noise which is
independent of distortion due to particular equipment, of the spectrum of the
interfering noise, and of the type and mode of presentation of the speech
material (69). Slight relation was found between listening ability and the
following measures (69): code ability, intelligence as measured by the 
GCT, auditory memory span (4), and speaking ability. Listening ability 
was found to be somewhat related to region of residence (18). Experiments 
were also made to determine the nature and extent of individual differences 
in the detection of small changes in noise; tests were constructed for dis- 
crimination of pitch and loudness of noises (68). These have not yet been 
used in studying listening ability. 

Rapid and extensive improvement of performance was obtained with 
training of both talkers and listeners (18, 110). Several training programs 
were devised and tested (5, 16, 23, 24, 64, 88, 92, 99, 122) ; manuals and 
syllabi were prepared (8, 9, 10, 14, 99); special training equipment was 
designed and used (11, 98). The improvement found was attributed to
several factors: training in voice technic (23), training in the use of
communication equipment (6, 13, 32, 47, 104, 110, 129), and training in the
identification of words that are partially masked by noise or distorted by
characteristics of the equipment used (24, 110).


Communication Equipment 


Along with speech material and communication personnel, communica- 
tion equipment was studied in the over-all program of bettering military 
communication. The testing methods used by the Harvard Psycho-Acoustic 
and Electro-Acoustic Laboratories and the results they obtained are re- 
viewed in a technical summary report (97). 

Microphones were tested with the human voice and with artificial “voices” 
(17). The properties of earphones were tested by utilizing the responses of 
listeners; by measuring sound pressures at the ear canal with a probe tube 
(102) ; and by the use of artificial “ears” (17, 132). The pressure distribu- 
tion in the auditory canal was also obtained in a progressive sound field by 
use of the probe-tube technic (133). Earphone cushions, earphone sockets 
and headsets (36, 37, 41, 57, 116), and masks (74) were tested for their 
effects on communication. Measurement was made of physiological noise 
generated under earphone cushions (94). The characteristics of micro- 
phones (56) and noise shields (35, 131), amplifiers (80), radio link (21), 
and receivers (20, 33, 86) were investigated. Studies were made of radio 
equipment, interphone equipment (43, 46, 71, 80, 83), sound-powered 
equipment (38, 54, 134), and radio-range equipment (55). 


Effects of Noise on Psychomotor Efficiency 


One of the first military projects of the Psycho-Acoustic Laboratory was 
to study the effects on psychomotor efficiency of intense noise and vibration 
(123, 128). A battery of psychological, psychomotor, and physiological 
tests was developed and used to evaluate the effects of noise on a wide 
variety of tasks. In some of the experiments subjects were exposed to 115 db 
of airplane noise for seven-hour work days over a one-month period. The 
subjects reported the noise to be disagreeable and tiring, but their perform- 
ance was largely unimpaired by it. They had temporary hearing losses 
following exposure to noise—losses whose extent and duration depended 
upon the over-all intensity of the noise, its spectrum, and length of
exposure. Other reports give fuller information on intense stimulation as the
cause of temporary deafness, injury of the inner ear, and other physiological 
effects (25, 26). In one of the psychomotor tests the subjects’ coordinated 
serial reaction time showed an increase of 5 percent in the noise. This 
was actually the greatest effect of acoustic stress shown by any of the 
psychomotor tests, and the validity of this result is open to question. No 
other test showed significant decrements of performance due to noise. 
Indeterminate effects of noise were found in the following tests: muscular 
tension, metabolism, breathing, speed of accommodation, saccadic
movements, body sway, hand steadiness, reversible perspective, and dark
adaptation. No effects of noise were found in these tests: coordinated serial
pursuit, serial disjunctive reaction time, fast-speed pursuit rotor, card sorting,
coding test, and judgment of distance. Vibration caused a considerable
reduction in visual acuity in every subject. An extensive investigation was
made of sound as a military weapon (19). Direct use of sound as a weapon
was found to be impractical because it required too great an expenditure
of energy.


Noise Measurement 


Investigation of the effects of noise on communication and psychomotor
efficiency required measurement of noise intensities and spectra. The
problems and methods which this entailed have been reviewed by Miller, Wiener,
and Stevens (97).


Combatting Noise 


Noise reduction, sound insulation, and aural protective devices were
used to lessen impairment of communications and to reduce annoyance
(97). Airplane noise was attenuated by sound-proofing materials (29).
Sound insulation was accomplished by proper design of earphone cushions,
sockets, and headsets (34, 42, 45, 49, 57, 114, 116, 118) and by
development of special insert tips for use with miniature earphones (117).
Protection against noise and gun blasts was afforded by special earplugs
(70, 77, 115). Reception of speech slightly above normal levels was not
impaired by the use of these earplugs; in noise, audibility of speech was,
in some instances, improved by their use (72).


Hearing Loss and Hearing Aids 


Several tests were developed for the direct measurement of hearing loss 
for speech (67, 105, 106, 108). These tests employ recordings of selected 
words and sentences, the loss being measured from standards set by normal 
subjects. The use of such test material transmitted thru filters may allow 
differential diagnosis of uniform losses and high-frequency hearing losses 
for speech (65). The tolerance of normal and hard-of-hearing subjects for 
intense sounds, both pure tones and speech, was determined. Thresholds 
of discomfort, of tickle, and of pain were measured (119). 

Commercial hearing aids were evaluated on the basis of electro-acoustic
and psycho-acoustic measurements (97, 102, 103). Studies of design
objectives for hearing aids (27, 28, 66) indicated the desirability of these
properties: uniform frequency response between 300 and 4000 cps,
limitation of maximum acoustic output, an effective gain control with a range
of at least 40 db, and no acoustic feedback or electrical feedback ("squeal").


Auditory Signals for Instrument Flying 


The possibility of putting some airplane instrument indications in audi- 
tory form was investigated (51, 52). Auditory signals were devised which 
“sounded like the behavior of the airplane,” and which did not interfere
with reception of radio-range signals or voice communication. An
“automatic annunciator” was developed to translate instrument indications
automatically into spoken messages and to announce them to the pilot. The
annunciator had a readily identifiable speech quality, and there was little 
difficulty in distinguishing between it and outside speech sources. 